ABSTRACT
Objective While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. We aim to systematically review the adherence of Machine Learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement.
Study design and setting We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields (PROSPERO, CRD42019161764). We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item.
Results Our search identified 24 814 articles, of which 152 articles were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0-46.4) of TRIPOD items. No articles fully adhered to complete reporting of the abstract and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and model’s predictive performance (5.9%, 95% CI 3.1 to 10.9). There was often complete reporting of source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3).
Conclusion Similar to prediction model studies developed using conventional regression-based techniques, the completeness of reporting is poor. Essential information to decide to use the model (i.e. model specification and its performance) is rarely reported. However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies and thus, TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste.
What is new?
Key findings: Similar to prediction model studies developed using regression techniques, machine learning (ML)-based prediction model studies adhered poorly to the TRIPOD statement, the current standard reporting guideline.
What this adds to what is known? In addition to efforts to improve the completeness of reporting in ML-based prediction model studies, an extension of TRIPOD for these type of studies is needed.
What is the implication, what should change now? While TRIPOD-AI is under development, we urge authors to follow the recommendations of the TRIPOD statement to improve the completeness of reporting and reduce potential research waste of ML-based prediction model studies.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
GSC is funded by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC) and by Cancer Research UK program grant (C49297/A27294). PD is funded by the NIHR Oxford BRC. The views expressed are those of the authors and not necessarily those of the NHS, NIHR, or Department of Health.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethical approval is not requiered because only published data is used.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
(c.l.andaurnavarro{at}umcutrecht.nl)
(j.a.a.damen{at}umcutrecht.nl)
(t.takada-3{at}umcutrecht.nl)
(S.W.J.Nijman{at}umcutrecht.nl)
(paula.dhiman{at}ndorms.ox.ac.uk)
(jie.ma{at}csm.ox.ac.uk)
(gary.collins{at}csm.ox.ac.uk)
(r.bajpai{at}keele.ac.uk)
(r.riley{at}keele.ac.uk)
(k.g.m.moons{at}umcutrecht.nl)
(l.hooft{at}umcutrecht.nl)
Data Availability
Detailed extracted data on all included studies are available upon reasonable request.