Abstract
Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease that is complex in its onset, pattern of spread, and disease progression. The heterogeneity of ALS makes it extremely challenging to determine whether a disease-modifying therapy is effectively slowing progression. While accurately modeling ALS progression is critical to developing therapeutics, current computational methods fail to capture the complexity of disease progression. We aimed to robustly characterize disease progression patterns in ALS.
We obtained data from four clinical cohorts that cover more than 3,500 patients and include both observational and clinical trial studies. To determine whether there were common patterns of disease progression, we developed an approach based on a Mixture of Gaussian Processes (MoGP) to model longitudinal clinical data. Our approach automatically identifies clusters of patients who show similar disease progression patterns, modeling their average trajectory and the spread of the distribution in each cluster. Importantly, the method does not require any prior knowledge of the expected number of clusters.
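The abstract does not state how the number of clusters is inferred. One common construction for mixture models with an open-ended number of components is a Dirichlet process prior, which can be simulated as a Chinese restaurant process. The sketch below is an illustration under that assumption, not the authors' implementation; the function name `crp_assignments` and the concentration parameter `alpha` are hypothetical. It samples from the prior only, whereas a full mixture model would also weight each choice by how well a patient's trajectory fits each cluster.

```python
import random


def crp_assignments(n_patients, alpha, seed=0):
    """Sample cluster labels from a Chinese restaurant process prior.

    Hypothetical helper for illustration: `alpha` controls how readily
    a new cluster is opened. No cluster count is specified in advance.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] = cluster index of patient i
    counts = []  # counts[k] = number of patients currently in cluster k
    for _ in range(n_patients):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


labels, counts = crp_assignments(n_patients=500, alpha=1.0)
print(len(counts), "clusters; sizes:", sorted(counts, reverse=True))
```

Under this kind of prior, the number of occupied clusters grows with the data rather than being fixed by the analyst.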
The MoGP approach revealed that ALS progression, as measured using the ALS functional rating scale (ALSFRS-R) or forced vital capacity, is often non-linear with periods of stable disease preceded or followed by rapid decline. Patterns of progression in ALSFRS-R were robust to sparse data. When at least one year of longitudinal data were available, MoGP predictions were significantly more accurate than linear models, which are commonly used in clinical trials. Progression patterns were consistent across different cohorts despite differences in the frequency of data collection and the lengths of follow-up periods. We further showed that clusters identified from one large, publicly available study population could be used to stratify unseen participants in other studies. We also showed that these progression trajectories correspond with survival outcomes.
This work highlights the importance of modeling nonlinear disease progression for developing more advanced clinical trial endpoint analysis models. In ALS, sporadic, rapid decline (“functional cliffs”) and sigmoidal patterns in disease progression in untreated patients may obscure detection of therapeutic efficacy if linear models are used. We provide a pre-trained computational model of observed clinical patterns that can be used by others to analyze new ALS patient cohorts. We expect that the MoGP approach can also be applied to additional ALS outcome measures and to other progressive diseases. Our results provide a critical advance in characterizing the complex disease progression patterns of ALS.
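To make the concern about linear models concrete, the following minimal sketch uses purely synthetic data; it is not drawn from any of the cohorts and is not the authors' analysis. It fits a straight line to the first year of a hypothetical sigmoidal ALSFRS-R trajectory and measures the root mean squared error when that line is extrapolated past a "functional cliff". All parameter values (`start`, `drop`, `midpoint`, `rate`) are assumptions chosen for illustration.

```python
import math


def sigmoid_score(t, start=48.0, drop=30.0, midpoint=1.5, rate=4.0):
    """Synthetic ALSFRS-R score at time t (years): stable, then a rapid decline."""
    return start - drop / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_line(ts, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    sxx = sum((t - mean_t) ** 2 for t in ts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Quarterly visits over three years (synthetic, noise-free).
ts = [i * 0.25 for i in range(13)]
ys = [sigmoid_score(t) for t in ts]

# Fit a linear slope model to the first year only.
early = [(t, y) for t, y in zip(ts, ys) if t <= 1.0]
slope, intercept = fit_line([t for t, _ in early], [y for _, y in early])

# Evaluate the linear extrapolation on the remaining visits.
later = [(t, y) for t, y in zip(ts, ys) if t > 1.0]
linear_rmse = rmse([slope * t + intercept for t, _ in later],
                   [y for _, y in later])
print(f"slope = {slope:.2f} points/year, extrapolation RMSE = {linear_rmse:.1f}")
```

In this toy setting, the slope estimated during the stable phase badly underestimates the later decline, which is the kind of mismatch a nonlinear trajectory model is meant to avoid.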
Competing Interest Statement
Dr. Berry reports personal fees from Biogen, personal fees from Clene Nanomedicine, grants from Alexion, grants from Biogen, grants from MT Pharma of America, grants from Anelixis Therapeutics, grants from Brainstorm Cell Therapeutics, grants from Genentech, grants from nQ Medical, grants from NINDS, grants from Muscular Dystrophy Association, grants from ALS One, grants from Amylyx Therapeutics, personal fees from MT Pharma Holdings of America, grants from ALS Association, grants from ALS Finding A Cure, grants from Rapa Therapeutics, and grants from MT Pharma Holdings of America. Dr. Fraenkel reports personal fees from Seer Biosciences, personal fees from Tech U, other from Sanofi, other from ReviveMed, personal fees from Microsoft Research, personal fees from Engine Biosciences, and personal fees from UBS. Dr. Glass reports grants from the Muscular Dystrophy Association during the conduct of the study. Dr. Ng, Dr. Severson, and Dr. Ghosh are employed by IBM Research. Dr. Fournier, Dr. Sachs, and Divya Ramamoorthy report no competing interests.
Funding Statement
This study was funded by the MIT-IBM Watson AI Lab and Answer ALS. The Answer ALS organization funded the collection of the AALS dataset. The Emory ALS database is supported by a grant from the Muscular Dystrophy Association. None of the organizations had any influence on the writing of the manuscript or the decision to submit it for publication.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Explicit approval was received for all clinical datasets used in the present work. AALS is an anonymized, publicly available dataset that does not require registration to download the clinical data. We received approval for CEFT from the National Institute of Neurological Disorders and Stroke (NINDS). For the original CEFT study, institutional review board approval was obtained at each center and participants provided written informed consent before screening. We received approval for PRO-ACT from the Pooled Resource Open-Access ALS Clinical Trials Consortium. PRO-ACT is an anonymized database that includes merged datasets from multiple ALS clinical trials. It requires an application to request access, in which the user must agree to protect the security of the data. Dr. Jonathan Glass provided approval and access for using the EMORY dataset. For the original EMORY dataset, the Emory institutional review board approved the study.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
We provide the Python code for the MoGP framework, as well as the pre-trained reference model described here, so that researchers can generate predictions of cluster membership and trajectory function from input patient data. All code used for data processing, modeling, and figure generation can be found at https://github.com/fraenkel-lab/mogp. AALS is publicly available for download (data.answerals.org). PRO-ACT can be downloaded by request (https://nctu.partners.org/ProACT). CEFT can be downloaded by request from the National Institute of Neurological Disorders and Stroke (NINDS) (https://www.ninds.nih.gov/Current-Research/Research-Funded-NINDS/Clinical-Research/Archived-Clinical-Research-Datasets). EMORY is restricted access at this time.
Abbreviations
- AALS: Answer ALS
- ALS: Amyotrophic Lateral Sclerosis
- ALSFRS-R: Revised ALS Functional Rating Scale
- CEFT: Clinical Trial of Ceftriaxone in ALS
- EMORY: Emory ALS Clinic database
- FVC: Forced Vital Capacity
- LKM: Linear Kernel Model
- MoGP: Mixture of Gaussian Processes model
- PRO-ACT: The Pooled Resource Open-Access ALS Clinical Trials
- RMSE: Root mean squared error
- SM: Slope Model