Abstract
BACKGROUND Machine learning promises to support the diagnosis of dementia and Alzheimer’s Disease, but may not perform well in new settings. We present a framework to assess the transportability of models predicting cognitive impairment in external settings with different demographics.
METHODS We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. We used these estimates to generate semi-synthetic datasets for training and validating prediction models. We measured transportability to external settings, simulated with interventions on age, APOE ε4, and sex, using differences in calibration metrics.
RESULTS Models that used causes of the outcome as predictors were 1.3 to 12.8 times more transportable than those that used its consequences. Logistic and lasso models had better calibration in internal validation settings than random forest and boosted models.
DISCUSSION Applying a framework considering causal relationships is crucial to assess transportability. Future research could investigate more interventions and methods to quantify causal relationships in risk prediction.
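The contrast between predicting with causes and predicting with consequences can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a toy three-variable structural equation model (standardized age → impairment → test score) with arbitrary coefficients, simulates an external setting by shifting mean age, and uses calibration-in-the-large (mean predicted risk minus observed prevalence) as one possible calibration metric. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def simulate(n, age_mean, rng):
    """Draw n records from a toy model: age -> impairment -> score.
    Coefficients are illustrative assumptions, not ADNI estimates."""
    rows = []
    for _ in range(n):
        age = rng.gauss(age_mean, 1.0)           # cause of the outcome
        y = 1 if rng.random() < sigmoid(-1.0 + 1.0 * age) else 0
        score = 2.0 * y + rng.gauss(0.0, 1.0)    # consequence of the outcome
        rows.append((age, score, y))
    return rows

def fit_logistic(xs, ys, iters=25):
    """One-predictor logistic regression fitted with Newton's method."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the 2x2 Hessian H.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
    return b0, b1

def calibration_in_the_large(model, xs, ys):
    """Mean predicted risk minus observed outcome prevalence."""
    b0, b1 = model
    mean_pred = sum(sigmoid(b0 + b1 * x) for x in xs) / len(xs)
    return mean_pred - sum(ys) / len(ys)

rng = random.Random(0)
internal = simulate(20000, 0.0, rng)
external = simulate(20000, 1.5, rng)  # intervention: shift mean age upward

# Change in absolute calibration error from internal to external setting.
results = {}
for name, col in (("cause", 0), ("consequence", 1)):
    x_int = [r[col] for r in internal]
    y_int = [r[2] for r in internal]
    x_ext = [r[col] for r in external]
    y_ext = [r[2] for r in external]
    model = fit_logistic(x_int, y_int)
    cil_int = calibration_in_the_large(model, x_int, y_int)
    cil_ext = calibration_in_the_large(model, x_ext, y_ext)
    results[name] = abs(cil_ext) - abs(cil_int)
    print(name, round(results[name], 3))
```

In this toy setup the model that predicts from age stays close to calibrated after the shift, because the conditional distribution of the outcome given its cause is unchanged by the intervention. The model that predicts from the test score inherits the internal outcome prevalence and therefore miscalibrates when that prevalence changes.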
Research in context
Systematic Review: Machine learning models supporting the diagnosis of cognitive impairment may not perform well in external validation settings. Theoretical research established that models can be more transportable to external settings when predictors are causes of the outcome. Causal frameworks and practical examples to assess transportability are needed.
Interpretation: We developed and applied a causal framework to assess the transportability of models predicting cognitive impairment to settings with different demographics using a causal graph and interventions on semi-synthetic data. Our results add a practical example showing that models are more transportable when predicting with causes of the outcome rather than with its consequences. This supports using causal frameworks in prediction models to improve transportability.
Future directions: Our framework can be extended to include more complex semi-synthetic data generation methods to quantify causal relationships. Further applications to risk prediction models could assess transportability under different interventions that simulate complex differences between populations.
Competing Interest Statement
Outside the submitted work, TK reports having received research grants from the German Joint Committee and the German Ministry of Health. He further reports personal compensation from Eli Lilly and Company, Teva, TotalEnergies S.E. and the BMJ. MP reports having received partial funding for a self-initiated research project from Novartis Pharma. MP further reports being awarded a research grant from the Center for Stroke Research Berlin (private donations) for a self-initiated project. All other authors declare no conflicts of interest.
Funding Statement
Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study involved only openly available human data. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu) and from the TADPOLE grand challenge (https://tadpole.grand-challenge.org/Data/).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
† Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at: https://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf
Data Availability
All data produced are available online at