Abstract
Background Molecular ageing clocks estimate an individual’s biological age. Our aim was to compare multiple machine learning algorithms for developing ageing clocks from nuclear magnetic resonance (NMR) spectroscopy metabolomics data. To validate how well each ageing clock predicted age-related morbidity and lifespan, we assessed their associations with multiple health indicators (e.g., telomere length and frailty) and all-cause mortality.
Methods The UK Biobank is a multicentre observational health study of middle-aged and older adults. The Nightingale Health platform was used to quantify 168 circulating plasma metabolites at the baseline assessment from 2006 to 2010. We trained and internally validated 17 machine learning algorithms, including regularised regression, kernel-based methods and ensembles. Metabolomic age (MileAge) delta was defined as the difference between predicted and chronological age.
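As a minimal sketch of the MileAge delta definition above, the following Python snippet computes the delta and the mean absolute error for a handful of made-up ages. The function names and the sign convention (predicted minus chronological age, so a positive delta means an older metabolomic age) are assumptions for illustration, not taken from the study's code, and the values are not study data.

```python
from statistics import mean


def mileage_delta(predicted_age, chronological_age):
    """Per-person delta: predicted minus chronological age (assumed sign convention)."""
    return [p - c for p, c in zip(predicted_age, chronological_age)]


def mean_absolute_error(predicted_age, chronological_age):
    """Average absolute difference between predicted and chronological age."""
    return mean(abs(d) for d in mileage_delta(predicted_age, chronological_age))


# Made-up illustrative values in years, not study data.
predicted = [60.0, 52.5, 70.0, 48.0]
chronological = [55.0, 54.5, 66.0, 49.0]

deltas = mileage_delta(predicted, chronological)
mae = mean_absolute_error(predicted, chronological)
```

In this toy example the first person is predicted to be five years older than their chronological age, which under the assumed convention corresponds to a positive MileAge delta.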
Results The sample included 101,359 participants (mean age = 56.53 years, SD = 8.10). Most metabolite levels varied by chronological age. The nested cross-validation mean absolute error (MAE) ranged from 5.31 to 6.36 years. In total, 31.76% of participants had an age-bias adjusted MileAge more than one standard deviation (3.75 years) above or below the mean. Overall, a Cubist rule-based regression model performed best at predicting health outcomes. The all-cause mortality hazard ratio (HR) comparing individuals with a MileAge delta more than one standard deviation above and below the mean was 1.52 (95% CI 1.41-1.64, p < 0.001) over a median follow-up of 13.87 years. Individuals with an older MileAge were frailer, had shorter telomeres, were more likely to have a chronic illness and rated their health worse.
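The grouping by standard deviation used in the results above could be sketched as follows. This is an illustration under stated assumptions: the function name `sd_groups` is hypothetical, the sample standard deviation is used, the input deltas are made-up values, and the age-bias adjustment mentioned in the abstract is not reproduced here because the abstract does not describe it.

```python
from statistics import mean, stdev


def sd_groups(deltas):
    """Label each delta as more than one SD above, below, or within one SD of the mean."""
    centre = mean(deltas)
    spread = stdev(deltas)  # sample standard deviation (an assumption)
    labels = []
    for d in deltas:
        if d > centre + spread:
            labels.append("above")
        elif d < centre - spread:
            labels.append("below")
        else:
            labels.append("within")
    return labels


# Made-up illustrative deltas in years, not study data.
example_deltas = [-6.0, -1.0, 0.0, 1.0, 6.0]
groups = sd_groups(example_deltas)
```

A comparison such as the reported hazard ratio would then contrast the "above" and "below" groups in a survival model, which this sketch does not attempt.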
Conclusions Metabolomic ageing clocks derived from multiple machine learning algorithms were robustly associated with health indicators and mortality. Our metabolomic ageing clock (MileAge), derived from a Cubist rule-based regression model, can be incorporated in research and may find applications in health assessments, risk stratification and proactive health tracking.
Competing Interest Statement
CML is a member of the scientific advisory board of Myriad Neuroscience, has received speaker fees from SYNLAB and has received consultancy fees from UCB. JM and RI declare no financial conflict of interest.
Funding Statement
This research is funded by the National Institute for Health and Care Research (NIHR) Maudsley Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. Computational analyses were supported by: King's College London. (2023). King's Computational Research, Engineering and Technology Environment (CREATE). Retrieved May 24, 2023, from https://doi.org/10.18742/rnvf-m076.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The data used are available to all bona fide researchers for health-related research that is in the public interest, subject to an application process and approval criteria. Study materials are publicly available online at http://www.ukbiobank.ac.uk. Ethical approval for the UK Biobank study has been granted by the National Information Governance Board for Health and Social Care and the NHS North West Multicentre Research Ethics Committee (11/NW/0382). No project-specific ethical approval is needed.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data used are available to all bona fide researchers for health-related research that is in the public interest, subject to an application process and approval criteria. Study materials are publicly available online at http://www.ukbiobank.ac.uk.