RT Journal Article
SR Electronic
T1 Unsupervised machine-learning identifies clinically distinct subtypes of ALS that reflect different genetic architectures and biological mechanisms
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.06.12.23291304
DO 10.1101/2023.06.12.23291304
A1 Spargo, Thomas P
A1 Marriott, Heather
A1 Hunt, Guy P
A1 Pain, Oliver
A1 Kabiljo, Renata
A1 Bowles, Harry
A1 Sproviero, William
A1 Gillett, Alexandra C
A1 Fogh, Isabella
A1 Project MinE ALS Sequencing Consortium
A1 Andersen, Peter M.
A1 Başak, Nazli A.
A1 Shaw, Pamela J.
A1 Corcia, Philippe
A1 Couratier, Philippe
A1 de Carvalho, Mamede
A1 Drory, Vivian
A1 Glass, Jonathan D.
A1 Gotkine, Marc
A1 Hardiman, Orla
A1 Landers, John E.
A1 McLaughlin, Russell
A1 Mora Pardina, Jesús S.
A1 Morrison, Karen E.
A1 Pinto, Susana
A1 Povedano, Monica
A1 Shaw, Christopher E.
A1 Silani, Vincenzo
A1 Ticozzi, Nicola
A1 Damme, Philip Van
A1 van den Berg, Leonard H.
A1 Vourc’h, Patrick
A1 Weber, Markus
A1 Veldink, Jan H.
A1 Dobson, Richard J.B.
A1 Khleifat, Ahmad Al
A1 Cummins, Nicholas
A1 Stahl, Daniel
A1 Al-Chalabi, Ammar
A1 Iacoangeli, Alfredo
YR 2023
UL http://medrxiv.org/content/early/2023/06/13/2023.06.12.23291304.abstract
AB Background: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterised by a highly variable clinical presentation and multifaceted genetic and biological bases that translate into great patient heterogeneity. The identification of homogeneous subgroups of patients in terms of both clinical presentation and biological causes could favour the development of effective treatments, healthcare, and clinical trials. 
We aimed to identify and characterise homogeneous clinical subgroups of ALS, examining whether they represent underlying biological trends. Methods: Latent class clustering analysis, an unsupervised machine-learning method, was used to identify homogeneous subpopulations in 6,523 people with ALS from Project MinE, using widely collected ALS-related clinical variables. The clusters were validated using 7,829 independent patients from STRENGTH. We tested whether the identified subgroups were associated with biological trends in genetic variation across genes previously linked to ALS, in polygenic risk scores of ALS and related neuropsychiatric traits, and in gene expression data from post-mortem motor cortex samples. Results: We identified five ALS subgroups based on patterns in clinical data that generalised across international datasets. Distinct genetic trends were observed for rare variants in the SOD1 and C9orf72 genes, and across genes implicated in biological processes relevant to ALS. Polygenic risk scores of ALS, schizophrenia and Parkinson’s disease were also higher in distinct clusters with respect to controls. Gene expression analysis identified different altered biological processes across clusters, reflecting the genetic differences. We developed a machine-learning classifier based on our model to assign subgroup membership using clinical data available at first visit, and made it available on a public webserver at http://latentclusterals.er.kcl.ac.uk. Conclusion: ALS subgroups characterised by highly distinct clinical presentations were discovered and validated in two large independent international datasets. Such groups were also characterised by different underlying genetic architectures and biology. 
Our results showed that data-driven patient stratification into more clinically and biologically homogeneous subtypes of ALS is possible and could help develop more effective and targeted approaches to the biomedical and clinical study of ALS. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This study did not receive any funding. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. All data produced in the present study are available upon reasonable request to the authors.