RT Journal Article SR Electronic T1 Using explainable artificial intelligence to identify linguistic biomarkers of amyloid pathology in primary progressive aphasia JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2024.05.02.24306657 DO 10.1101/2024.05.02.24306657 A1 Robertson, Cole A1 Rezaii, Neguine A1 Hochberg, Daisy A1 Quimby, Megan A1 Wolff, Phillip A1 Dickerson, Bradford C. YR 2024 UL http://medrxiv.org/content/early/2024/05/05/2024.05.02.24306657.abstract AB Introduction: Recent success has been achieved in Alzheimer’s disease (AD) clinical trials targeting amyloid beta (β), demonstrating a reduction in the rate of cognitive decline. However, testing methods for amyloid-β positivity are currently costly or invasive, motivating the development of accessible screening approaches to steer patients toward appropriate diagnostic tests. Here, we employ a pre-trained language model (Distil-RoBERTa) to identify amyloid-β positivity from a short, connected speech sample. We further use explainable AI (XAI) methods to extract interpretable linguistic features that can be employed in clinical practice. Methods: We obtained language samples from 74 patients with primary progressive aphasia (PPA) across its three variants. Amyloid-β positivity was established through the analysis of cerebrospinal fluid, amyloid PET, or autopsy. 51% of the sample was amyloid-positive. We trained Distil-RoBERTa for 16 epochs with a batch size of 6 and a learning rate of 5e−5, and used the LIME algorithm to train interpretation models to interpret the trained classifier’s inference conditions. Results: Over ten runs of 10-fold cross-validation, the classifier achieved a mean accuracy of 92%, SD = 0.01.
Interpretation models were able to capture the classifier’s behavior well, achieving an accuracy of 97% against classifier predictions, and uncovering several novel speech patterns that may characterize amyloid-β positivity. Discussion: Our work improves on previous research, which indicates that connected speech is a useful diagnostic input for prediction of the presence of amyloid-β in patients with PPA. Further, we leverage XAI techniques to reveal novel linguistic features that can be tested in clinical practice in the appropriate subspecialty setting. Computational linguistic analysis of connected speech shows great promise as a novel assessment method in patients with AD and related disorders. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was supported by the US National Institute on Deafness and Other Communication Disorders grants R01 DC014296 to BCD and R21 DC019567 to BCD and PW, National Institute on Aging grants R01 AG081249 to BCD and R21 AG073744 to BCD and PW, National Institute of Neurological Disorders and Stroke grant RF1 NS131395 to BCD, and Alzheimer's Association grant 23AACSF-1029880 and MGH Screening Technologies in Primary Care Innovation Fund (PCIF) 2023A063002 to NR. This research was carried out in part at the Athinoula A. Martinos Center for Biomedical Imaging at the MGH, using resources provided by the Center for Functional Neuroimaging Technologies, P41EB015896, a P41 Biotechnology Resource Grant supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB), National Institutes of Health.
This work also involved the use of instrumentation supported by the NIH Shared Instrumentation Grant Program and/or High-End Instrumentation Grant Program, specifically, grant number(s) S10RR021110, S10RR023043, S10RR023401. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: All study participants provided informed consent in accordance with guidelines established by the Mass General Brigham Healthcare System Institutional Review Boards, which govern human subjects research at Mass General Hospital and specifically approved this entire study. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. Data will be accessible upon request to Dr. Bradford Dickerson at brad.dickerson{at}mgh.harvard.edu
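
Illustrative note (not part of the original record): the evaluation protocol named in the abstract, ten runs of 10-fold cross-validation summarized as a mean accuracy and SD, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' pipeline. The data are synthetic, the 38/36 class split is only inferred from "74 patients" and "51%", and a toy threshold classifier stands in for the fine-tuned Distil-RoBERTa model. The helper names (`stratified_folds`, `repeated_cv`, `fit_threshold`, `predict_threshold`) are hypothetical, and pooling accuracy over each run is one possible reading of "mean accuracy"; the record does not say how the figure was aggregated.

```python
import random
import statistics


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped.
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


def repeated_cv(features, labels, fit, predict, runs=10, k=10, seed=0):
    """Return one pooled accuracy per run of k-fold cross-validation."""
    rng = random.Random(seed)
    run_accuracies = []
    for _ in range(runs):
        correct = 0
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            model = fit([features[i] for i in train_idx],
                        [labels[i] for i in train_idx])
            correct += sum(predict(model, features[i]) == labels[i]
                           for i in test_idx)
        # Every sample is held out exactly once per run.
        run_accuracies.append(correct / len(labels))
    return run_accuracies


def fit_threshold(xs, ys):
    """Toy stand-in classifier: threshold halfway between the class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def predict_threshold(threshold, x):
    """Predict the positive class when the feature exceeds the threshold."""
    return 1 if x > threshold else 0


# Synthetic data only: 74 samples with an ASSUMED 38/36 class split.
data_rng = random.Random(42)
labels = [1] * 38 + [0] * 36
features = [data_rng.gauss(1.0 if y == 1 else -1.0, 1.0) for y in labels]

accuracies = repeated_cv(features, labels, fit_threshold, predict_threshold)
print(f"mean accuracy = {statistics.mean(accuracies):.2f}, "
      f"SD = {statistics.stdev(accuracies):.2f}")
```

Reshuffling the folds on every run is what makes the SD across runs informative: it reflects sensitivity to how the 74 samples happen to be partitioned, not only to the model itself.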