Abstract
Metabolic syndrome (MetS), known to substantially lower the quality of life, is associated with an increased incidence of non-communicable diseases (NCDs) such as type II diabetes mellitus, cardiovascular diseases and cancer. Evidence suggests that MetS accounts for the highest global mortality rate. Various statistical and machine learning (ML) techniques have been developed to support the early and accurate clinical diagnosis of MetS. We performed a systematic review to investigate the statistical and ML techniques that have been used to support the clinical diagnosis of MetS from the earliest studies to January 2020. Published literature relating to statistical and ML techniques for the diagnosis of MetS was identified by searching five major scientific databases: PubMed, Science Direct, IEEE Xplore, ACM Digital Library, and SpringerLink. Fifty-three primary studies that met the inclusion criteria were obtained after screening titles, abstracts and full texts. Three main types of techniques were identified: statistical (n = 10), ML (n = 40), and risk quantification (n = 3). The standardised Z-score is the only statistical technique identified, while the ML techniques include principal component analysis, confirmatory factor analysis, artificial neural networks, multiple logistic regression, decision trees, support vector machines, random forests, and Bayesian networks. The areal similarity degree risk quantification, Framingham risk score and simScore were the three risk quantification techniques identified. Evidence suggests that the evaluated ML techniques, with accuracy ranging from 75.5% to 98.9%, can diagnose MetS more accurately than both statistical and risk quantification techniques. The standardised Z-score is the most frequent statistical technique identified.
However, the reported performance measures indicate that the decision tree and artificial neural network ML techniques have the highest predictive performance for MetS. Evidence suggests that a more accurate diagnosis of MetS is required to evaluate the predictive performance of the statistical and ML techniques.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research was supported by the Grand Challenge Grant - HTM (Wellness): GC003A-14HTM from University of Malaya and IIRG Grant (IIRG002C-19HWB) from University of Malaya.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
No IRB approval was necessary for this research work.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Abbreviations
The following abbreviations are used in this manuscript:
- 2hPG
- 2-Hour Postload Plasma Glucose
- AACE
- American Association of Clinical Endocrinologists
- ACC
- Accuracy
- AHA/NHLBI
- American Heart Association/National Heart, Lung, and Blood Institute
- AI
- Artificial Intelligence
- AIn
- Adiposity Index
- ANN
- Artificial Neural Network
- ART
- Adaptive Resonance Theory
- ARTMAP
- Adaptive Resonance Theory Mapping
- ASD
- Areal Similarity Degree
- AUC
- Area Under The ROC Curve
- BA
- Bayesian ART
- BAM
- Bayesian ARTMAP
- BFP
- Body Fat Percentage
- BMI
- Body Mass Index
- BN
- Bayesian Network
- BP
- Blood Pressure
- BkP
- Back Propagation Algorithm
- CFA
- Confirmatory Factor Analysis
- CFI
- Comparative Fit Index
- CHAID
- Chi-Squared Automatic Interaction Detection
- CHOL
- Total Cholesterol
- CLUSTer
- Cohort Study on Clustering of Lifestyle Risk Factors and Understanding its Association with Stress on Health and Wellbeing
- cm
- Centimeter
- cMetSRS
- Continuous MetS Risk Score
- CVD
- Cardiovascular Disease
- DALY
- Disability Adjusted Life Years
- DBP
- Diastolic Blood Pressure
- DT
- Decision Tree
- EGIR
- European Group for the Study of Insulin Resistance
- FA
- Fuzzy ART
- FAM
- Fuzzy ARTMAP
- FN
- False Negative
- FP
- False Positive
- FPG
- Fasting Plasma Glucose
- FPR
- False Positive Rate
- FRS
- Framingham risk score
- FSCORE
- FSCORE
- GA
- Genetic Algorithm
- GAFAM
- Genetic Algorithm Fuzzy ARTMAP
- GBT
- Gradient Boosted Trees
- GOBAM
- Genetically Optimised Bayesian ARTMAP
- HC
- Hip Circumference
- HDL-C
- High-Density Lipoprotein Cholesterol
- HOMA-IR
- Homeostasis Model Assessment Insulin Resistance Index
- HYP
- Hypertension
- IAF
- Intra-Abdominal Fat
- IAS
- International Atherosclerosis Society
- IASO
- International Association for the Study of Obesity
- IDF
- International Diabetes Federation
- IGT
- Impaired Glucose Tolerance
- IS
- Insulin Sensitivity
- JDC
- Japanese Diagnostic Criteria
- JIS
- Joint Interim Statement
- kg/m2
- Kilogram Per Square Meter
- KNN
- K-Nearest Neighbour
- LDL-C
- Low-Density Lipoprotein Cholesterol
- MCC
- Matthews Correlation Coefficient
- MetS
- Metabolic Syndrome
- MetSSS
- MetS severity score
- mg/dl
- Milligram Per Deciliter
- ML
- Machine Learning
- MLR
- Multiple Logistic Regression
- mmHg
- Millimeter Of Mercury
- mmol/L
- Millimoles Per Liter
- MRF
- Metabolic Syndrome Risk Factor
- N/A
- Not Available
- NCD
- Non-Communicable Disease
- NCEP ATP III
- National Cholesterol Education Program Adult Treatment Panel III
- NHLBI
- National Heart, Lung, and Blood Institute
- NPV
- Negative Predictive Value
- PCA
- Principal Component Analysis
- PCLR
- Principal Component Logistic Regression
- PMX
- Partially Mapped Crossover
- PPV
- Positive Predictive Value
- QMI
- Quantitative Metabolic Index
- QPSO
- Quantum Particle Swarm Optimisation
- REFS
- Reverse Engineering and Forward Simulation
- RF
- Random Forest
- RMSEA
- Root Mean Square Error Of Approximation
- ROC
- Receiver Operating Characteristic
- SBP
- Systolic Blood Pressure
- SEN
- Sensitivity
- SNPs
- Single Nucleotide Polymorphisms
- SPEC
- Specificity
- SQF
- Subcutaneous Fat
- SRMR
- Standardised Root Mean Square Residual
- STD
- Standard Deviation
- SVM
- Support Vector Machine
- T2DM
- Type II Diabetes Mellitus
- TG
- Triglycerides
- TN
- True Negative
- TP
- True Positive
- TPR
- True Positive Rate
- UCI
- University of California, Irvine
- UMMC
- University Malaya Medical Centre
- WC
- Waist Circumference
- WHO
- World Health Organisation
- WHR
- Waist-Hip Ratio
- WHtR
- Waist-Height Ratio
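Several of the abbreviations above (ACC, SEN, SPEC, FPR, PPV, NPV, MCC) are performance measures derived from the four confusion-matrix counts TP, FP, TN and FN. The following is a minimal sketch of the standard definitions, included only to make the relationships between these abbreviations concrete. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
import math


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute common diagnostic performance measures from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives. Undefined ratios return NaN.
    """
    def ratio(num, den):
        return num / den if den else float("nan")

    total = tp + fp + tn + fn
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": ratio(tp + tn, total),          # accuracy
        "SEN": ratio(tp, tp + fn),             # sensitivity (same as TPR)
        "SPEC": ratio(tn, tn + fp),            # specificity
        "FPR": ratio(fp, fp + tn),             # false positive rate = 1 - SPEC
        "PPV": ratio(tp, tp + fp),             # positive predictive value
        "NPV": ratio(tn, tn + fn),             # negative predictive value
        "MCC": ratio(tp * tn - fp * fn, mcc_den),  # Matthews correlation coefficient
    }


# Hypothetical counts for illustration only (not results from the review).
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because ACC depends on the proportion of MetS cases in the sample, measures such as SEN, SPEC and MCC are usually reported alongside it when classifiers are compared.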