PT - JOURNAL ARTICLE
AU - Clift, Ashley K.
AU - Lannou, Erwann Le
AU - Tighe, Christian P.
AU - Shah, Sachin S.
AU - Beatty, Matthew
AU - Hyvärinen, Arsi
AU - Lane, Stephen J.
AU - Strauss, Tamir
AU - Dunn, Devin D.
AU - Lu, Jiahe
AU - Aral, Mert
AU - Vahdat, Dan
AU - Ponzo, Sonia
AU - Plans, David
TI - Development and validation of risk scores for all-cause mortality for the purposes of a smartphone-based ‘general health score’ application: a prospective cohort study using the UK Biobank
AID - 10.1101/2020.11.23.20229161
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.11.23.20229161
4099 - http://medrxiv.org/content/early/2020/11/24/2020.11.23.20229161.short
4100 - http://medrxiv.org/content/early/2020/11/24/2020.11.23.20229161.full
AB - Background: Although links between individuals' behaviours and potentially adverse health outcomes are well established, the models developed to date have been either simple and univariate, or multivariate yet difficult to employ. Such models are unlikely to capture the wider determinants of health in the broader population. Hence, there is a need for a multidimensional, yet widely employable and accessible, way to obtain a comprehensive health metric. Objective: To develop and validate a novel, easily interpretable points-based health score (“C-Score”) derived from metrics measurable using smartphone components, and iterations thereof that utilise statistical modelling and machine learning approaches. Methods: A comprehensive literature review identified suitable predictor variables for inclusion in a first-iteration points-based model. This was followed by a prospective cohort study in a UK Biobank population to validate the C-Score, and to develop and comparatively validate variations of the score using statistical/machine learning models, in order to assess the balance between expediency and ease of interpretability versus model complexity.
Primary and secondary outcome measures: Discrimination of a points-based score for all-cause mortality within 10 years (Harrell’s c-statistic). Discrimination and calibration of Cox proportional hazards models and machine learning models that incorporate C-Score values (or raw data inputs) and other predictors to predict risk of all-cause mortality within 10 years. Results: The cohort comprised 420,560 individuals. During a cohort follow-up of 4,526,452 person-years, there were 16,188 deaths from any cause (3.85%). The points-based model had good discrimination (c-statistic = 0.66). There was a 31% relative reduction in risk of all-cause mortality per decile of increasing C-Score (hazard ratio: 0.69, 95% CI: 0.663 to 0.675). A Cox model integrating age and C-Score had improved discrimination (by 8 percentage points; c-statistic = 0.74) and good calibration. Machine learning approaches did not offer improved discrimination over statistical modelling. Conclusions: The novel health metric (‘C-Score’) has good predictive capabilities for all-cause mortality within 10 years. Embedding C-Score within a smartphone application may represent a useful tool for democratised, individualised health risk prediction. A simple Cox model using C-Score and age optimally balances parsimony and accuracy of risk predictions and could be used to produce absolute risk estimations for application users.
Competing Interest Statement: AKC is a previous consultant for Huma. DP, SP, ELL, CPT, SSS, MB, AH, TS, DD, JL, MA, DV, and SL are employees of Huma Therapeutics.
Funding Statement: Funding for the purposes of this project was provided by a contract between Chelsea Digital Ventures and Huma Therapeutics (previously known as Medopad).
Funders had no role in the data acquisition, data analysis or the write-up of this manuscript.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Access to anonymised data for the UK Biobank cohort was granted by the UK Biobank Access Management Team (application number 55668). Ethical approval was granted by the national research ethics committee (REC 16/NW/0274) for the overall UK Biobank cohort.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
The UK Biobank cohort data is available to researchers as approved by the Biobank Access Management Team. Due to commercial sensitivity, we have not presented the complete raw weighting system for deriving the C-Score here: this could be made available by Huma to academic partners seeking to collaborate to externally validate C-Score models in other datasets. The R/Python code used by the investigators for Cox modelling/ML modelling can be provided on request to the authors.
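
Note on the discrimination measure: the abstract reports discrimination as Harrell's c-statistic. The investigators' own code is available only on request, so the following is a minimal, illustrative sketch of how that statistic is commonly computed for right-censored data, not the authors' implementation. The function name `harrell_c` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped as a simplifying assumption. Because the abstract associates a higher C-Score with lower mortality, a caller would presumably pass the negated score as the risk.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- follow-up time for each individual
    events      -- 1 if death was observed, 0 if the individual was censored
    risk_scores -- predicted risk; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # simplifying assumption: skip tied follow-up times
        if not events[a]:
            continue  # earlier time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study).
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.5, 0.3, 0.1]
print(harrell_c(toy_times, toy_events, toy_risk))  # 0.875 (7 of 8 comparable pairs)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.66 for the points-based score and 0.74 for the Cox model with age.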