Abstract
Objectives To develop a screening tool for timely referral for advanced therapy (AT) in patients with Parkinson’s disease (PD), and to compare the newly-developed tool with the published 5-2-1 criteria.
Design Cross-sectional, diagnostic, observational study and multivariable logistic regression analysis for item selection.
Setting 8 hospitals in the Northern part of the Netherlands situated in the catchment area of a specialized movement disorder centre.
Participants 259 consecutive PD patients not yet on AT visiting the outpatient clinic of participating hospitals from February 2017 to July 2018.
Predictors 24 patient and disease characteristics as assessed by the treating neurologists, and scores on the NMS questionnaire.
Outcome Apparent eligibility for referral for AT based on consensus by a panel of 5 experts in the field of AT.
Results 17 patients were deemed eligible for referral for AT (point prevalence: 6.6%). Presence of response fluctuations and troublesome dyskinesias were the strongest independent predictors of being eligible for referral. Both variables were included in the final model, as well as levodopa equivalent daily dose. Decision curve analysis showed that the new model outperformed the 5-2-1 criteria. A simple chart was constructed to provide guidance for referral. Discrimination of this simplified scoring system was good (AUC after bootstrapping: 0.97).
Conclusion The screening tool may improve efficiency of referral and subsequent treatment with AT in patients with PD. External validation is required prior to application in daily practice.
Anyone who wishes to share, reuse, remix, or adapt this material must obtain permission from the corresponding author.
INTRODUCTION
Parkinson’s disease (PD) is a common neurodegenerative disorder for which a cure is lacking.[1] Treatment is aimed at alleviating symptoms and improving quality of life through lessening of motor and non-motor symptoms using dopaminomimetic drugs.[2] Within ten years of treatment with oral or transdermal dopaminomimetics, many PD patients develop disabling motor complications, typically involving (unpredictable) response fluctuations and troublesome dyskinesias.[3] Initially, these motor complications can be circumvented by optimizing the treatment regimen, but at some point oral and transdermal treatment no longer suffices. [4]
Advanced therapies (AT) are effective treatments in patients with PD suffering from motor complications that can no longer be treated adequately using oral and transdermal dopaminergic drugs.[5] The term AT refers to three invasive treatment modalities, namely deep brain stimulation of the subthalamic nucleus (DBS), continuous subcutaneous apomorphine infusion (CSAI) and infusion of levodopa-carbidopa intestinal gel (LCIG).[6–9] All three treatments have been shown to improve motor function and increase quality of life in PD patients suffering from unpredictable response fluctuations and troublesome dyskinesias.[7–9]
Patient eligibility for AT is usually determined in specialized movement disorders centres by means of a comprehensive and standardized selection process. [10,11] Currently, however, timely referral to these specialized centres is suboptimal, as it remains challenging to identify PD patients who might be candidates for AT. Indeed, neurologists who have less experience with AT and infrequently treat patients with PD, have difficulty timing referral, i.e., not too early and not too late.[Moes et al, in submission]. This difficulty is probably due to a lack of clear and easy-to-use criteria for pre-selection.[12] This leads to both inappropriate referrals (i.e. a patient will be rejected in the specialized centre) and under-referral (i.e. a potentially eligible patient is not given the opportunity to be evaluated).[13] With regard to the suboptimal referral pattern, a group of international experts pleaded for the development of guidance for general neurologists to improve timely referral of patients for AT.[14]
Recently, multiple attempts have been made to define criteria for which patients should be referred for AT.[11,15] These attempts have led to the development of a simple screening tool, the so-called 5-2-1 screening criteria, to identify advanced PD patients in whom any AT may be considered.[14,16,17] According to these criteria patients who fulfil at least one of the following characteristics have advanced PD: 5 times oral levodopa tablets taken daily; 2 hours of off time per day; 1 hour of troublesome dyskinesia per day. Although these criteria are easy to use and possibly sensitive in diagnosing advanced PD,[16,17] the 5-2-1 criteria have not been developed nor tested using prospective data from routine care. An important limitation is the lack of data on the diagnostic accuracy of the 5-2-1 criteria, including a reliable estimation of the positive predictive value.
To circumvent the limitations of the 5-2-1 criteria, the aim of this study was to develop a new, user-friendly screening tool using data from a cross-sectional study in routine care. An additional goal was to compare the diagnostic performance of the newly developed tool with the 5-2-1 criteria.
METHODS
The TRIPOD guidelines for multivariable prediction models were followed for the development, internal validation and reporting of this clinical screening tool. [18] The Medical Ethics Review Committee of the University Medical Center Groningen (the Netherlands) waived the need for consent. All patient data was processed in accordance with current privacy legislation.
Design
This was a cross-sectional study aimed at the development of a tool to screen for PD patients eligible for referral for AT. The target population were consecutive PD patients attending regular outpatient appointments to secondary care settings. The study was conducted between February 2017 and November 2018. The clinical data was collected prospectively in the period February 2017-July 2018. The assessments by the expert panel took place in three meetings between June 2018 and November 2018. No external validation was performed.
Participants
The study was be performed in a pre-existing referral network in the northern part of the Netherlands (Parkinson Platform Noord Nederland). This is a network care setting with one university hospital (University Medical Center Groningen, offering DBS, CSAI and LCIG) and 13 referring community hospitals.
In short, characteristics of consecutive PD patients visiting the outpatient clinic were assessed and recorded by neurologists and supervised PD nurses (health care professionals; HCP) from the community hospitals. Hospitals participating in the study were instructed to include the first 50 consecutive PD patients. A panel of experts evaluated the anonymized patient data to determine whether individual patients would be good candidates for referral (reference test).
Patients and HCP
Inclusion criteria for PD patients were age > 18 years and levodopa-responsive PD according to the MDS Diagnostic Criteria.[19] Patients who had already been treated with AT were excluded. In the community hospitals, HCP were entitled to participate in the data collection if they were officially registered as a general neurologist, or as PD nurse provided that he/she was supervised by a neurologist. Neurologists who were member of the expert panel were not allowed to be involved in the initial assessment of the patients enrolled.
Expert panel
The expert panel consisted of five movement disorder specialists (neurologists) with extensive experience in treating PD patients and substantial experience in applying AT. Table A1 (appendix) shows the characteristics of the experts. One neurologist (TvL) was expert in the field of DBS, LCIG and CSAI. Two neurologists (MvK and BvH) were experts in the field of LCIG infusion. Two other neurologists (AP and TM) were experts in the field of CSAI. The experts did not receive any financial compensation for participating in this study.
Outcome and predictor variables
The screening tool was designed to predict the following outcome: eligibility for referral to a specialized centre for additional evaluation for treatment with any AT (DBS, LCIG or CSAI). In short, the outcome was determined by an expert panel (see below). The potential predictors of the model were clinical and demographic variables as recorded during routine consultations at the outpatient clinic.
Predictor variables
All included patients were asked to complete the Dutch version of the NMS-quest prior to their regular visit to the outpatient neurology clinic.[20] During the visit, the HCP assessed the clinical and demographic variables of interest and filled out the study assessment form (see table A2 in the appendix for the complete list of all potential predictors). The form contained information on demographics, motor symptoms, non-motor symptoms and the medication regimen, all considered relevant in determining eligibility for AT.[14] The assessment of these variables reflected daily practice, as all information of interest was easily accessible during routine consultations at the outpatient clinic.
In addition to filling out the clinical characteristics, the HCP had to state whether they felt the patient would be eligible for referral for AT.
Outcome
In the absence of a standardized reference test for eligibility for referral (the ‘gold standard’), the primary outcome was based on the judgment (consensus) of an expert panel. For practical reasons, we used a staged approach for decision making, as described elsewhere. [21]
The expert panel evaluated all the recorded information on the enrolled patients (see section predictor variables), while they remained blinded for the judgment of the HCP. Based on all this information, the expert panel determined whether the patient was eligible for referral for AT.
First, all experts assessed each case individually and submitted their judgement to the main investigator (HM). Second, a selection of cases was discussed during a plenary session with the experts. Cases were selected if there was disagreement among the individual experts, or disagreement between the judgment of the HCP and the expert panel (see figure A1 in appendix). During the plenary session, consensus on the selected cases was reached using voting rounds. All experts voted by raising a ballot paper (yes, no, indifferent). In case of no unanimity, the experts discussed their viewpoints. Thereafter, a second voting round was performed. In case of no consensus, a majority vote was decisive.
Sample size calculation
A common maxim for group size calculations in studies developing prediction models is that one requires at least 10 cases (from the smallest category) per predictor variable.[18,22] Our initial aim was to obtain a group size of 1000 patients in total. However, for practical reasons, we downsized this aim to 300 patients. We assumed this number to be sufficient to study 3 independent predictors, as we estimated the prevalence of AT eligibility to be 10% among all PD patients.
Data processing
The data collection and management for this research project was performed using the OpenClinica, Community Edition, version 3.14. All study data was stored in a secure location to which only authorized researchers had access. The data set from OpenClinica was exported to a csv-file to perform the statistical analyses.
Statistical analyses
The statistical analyses were performed in SPSS Statistics version 25.0, R Statistics version 4.0.2 and Microsoft Excel 2019.
Patient demographics were examined using descriptive statistics. Continuous, normally distributed variables were summarized as mean with standard deviation (SD). Categorical and ordinal variables were described as number and percentage.
The prevalence of referral eligibility for AT was determined using the Clopper-Pearson binomial test at a 95% confidence interval (95% CI). The degree of agreement between the assessing HCP and the expert panel was expressed with Cohen’s kappa.
Missing values
Missing values were not imputed in the primary, univariable logistic regression analyses. Prior to multivariable logistic regression (see modelling), missing values were imputed using multiple imputation (MICE package in R). In total, 10 imputation sets were created. Missing values were predicted on the basis of all other predictors that had at least a minimum correlation (0.2) to make sure the imputation model would converge. The outcome was excluded from the imputation model. For continuous variables, we checked that observed and imputed values were in the same range.
Modelling, predictor selection and internal validation
Associations between potential predictors and the outcome variable were studied using univariable logistic regression analyses. Continuous variables were not dichotomized. Ordinal variables were not converted into continuous variables. P-values < 0.05 were considered statistically significant. No adjustments were made for multiple testing.
Before the multiple regression modelling, small categories of ordinal predictors were grouped to eliminate sparse categories. During multiple logistic regression modelling, key predictors were identified using 50-fold cross validation and lasso GLM in order to internally validate the model and minimize the risk of overfitting. We used a conservative estimate for lambda (1 SE) to find the simplest model by means of ensemble modelling and majority voting.
Model performance, simplified scoring system and comparison with the 5-2-1 criteria
We reported the β values and odds ratios of the final multivariable logistic regression model (‘pooled’ estimate of the multiple imputed dataset). For the variable levodopa equivalent daily dose (LEDD)[23], the β values and odds ratios were reported with increments of 100 mg/day. To create an easy-to-use screening tool, we made a simplified version of the model by rounding the regression coefficients to easy-to-sum integers.
A calibration plot was constructed to evaluate the calibration of the final model. We assessed the predictive performance of the simplified screening tool by examining the receiver operating characteristics (ROC) curve and its area under the curve (AUC) in the multiple imputed dataset. We reported a bootstrapped estimate of the AUC of the simplified scoring system. For bootstrapping, 1000 samples were drawn with replacement of the original sample.
We used decision curve analysis (DCA) compare the clinical utility of the final model with 5-2-1 criteria.[24] In short, DCA compares the net benefit of two models for a range of threshold values. At any given threshold, the model with the higher net benefit is the preferred model. To calculate the numbers for the 5-2-1 criteria, we used the imputed dataset and we determined that «at least 1 hour of troublesome dyskinesias» was positive if the patient had 0-2 hours of troublesome dyskinesias per day (on the study assessment form). Additional information on DCA, including a step-by-step guide for interpreting decision curves, can be found at www.decisioncurveanalysis.org.[25,26]
Screening tool
For user-friendliness we created a graphical presentation of the simplified scoring system. To illustrate the possible application of the prototype screening tool, we obtained an arbitrary, non-empirical threshold value by determining the coordinates of the ROC-curve at which the sum of sensitivity and specificity was highest. In addition, we calculated the positive predictive value and negative predictive value.
RESULTS
Included cases
In total, 263 assessment forms and NMS questionnaires were filled out in eight different hospitals (figure 1). Of these, 259 cases were suitable for assessment by the expert panel.
The HCP deemed 24 patients eligible for referral for AT, while 12 of these patients did not want to be referred for AT (figure A2 in appendix). The expert panel considered 17 cases to be eligible for referral for AT; the remaining 242 patients were not considered eligible for referral (figure A3). Based on these numbers, the prevalence of referral eligibility for AT was 6.6% (95% CI: 3.9-10.3%). The agreement between the assessment of the referring HCP and the expert panel was moderate (Cohen’s Kappa: 0.44, table A3 in appendix).
Patient characteristics
The patient and disease characteristics of the included cases are shown in table 1. The mean age of all patients was 68.0 years (SD: 9.6). The majority of the patients were male (65.9%). The mean disease duration was 5.0 years (SD: 4.5), with the disease duration being significantly higher in the group of patients eligible for referral for AT (mean duration of disease: 11.5 years; SD: 3.1).
Motor symptoms of advanced PD were more common in patients eligible for referral for AT. However, these motor symptoms were absent in the majority of all patients: 76.9% had no response fluctuations, 88.4% experienced no treatment-resistant tremor, 90.2% did not suffer from troublesome dyskinesias, and 83.5% did not experience off periods.
Model development and model specification
Table 2 shows the uncorrected, univariable association between each candidate predictor and eligibility for referral for AT. The number of complete cases and missing values per candidate predictor are also shown. Table 3 presents the final multivariable logistic regression model for eligibility for referral for AT, including the variables LEDD, response fluctuations and troublesome. Table 4 shows the simplified scoring system for determining eligibility for referral for AT.
Performance of the model and screening tool
The calibration slope of the model (figure 2) showed moderate calibration in-the-large, with a tendency to overestimation. Decision curve analysis (figure 3) showed that the model had higher net benefit than the 5-2-1-criteria for all possible threshold values. The 5-2-1 criteria had a comparable net benefit for lower threshold values, i.e. when the ‘costs’ of false-negative are approximately 50 times higher than the ‘costs’ of a false-positive (threshold value: 2% [1/0.02 = 50])). The simplified scoring system (screening tool) had a good discriminative performance (ROC area: 0.97; 95% CI: 0.94-0.99; bootstrap bias: 0.0003; figure 4).
Based on our arbitrary definition, the optimal cut-off point of the screening tool was 6.42. Using this cut-off point, the screening tool had a higher positive predictive value than the 5-2-1 criteria (76 vs. 20% respectively; table 5). Both models had a comparable sensitivity (88 vs 94%), while the screening tool had a higher specificity than the 5-2-1 criteria (98 vs 73%).
DISCUSSION
This study had a number of important results. First, the prevalence of eligibility for referral for AT was relatively low, being 6.6% in a population of consecutive PD patients in a secondary care setting. Second, this study showed several strong predictors of eligibility for referral for AT, such as the presence of response fluctuations and troublesome dyskinesias. Finally, the results enabled the development of a three-factor screening tool (LEDD, response fluctuations, troublesome dyskinesias). This tool outperformed the 5-2-1 criteria.
The primary aim of this study was to develop a screening tool that helps in determining whether a PD patient should be referred to an specialized clinic for considering treatment with AT. This goal was not revolutionary, as the first attempts to improve the referral for AT go back to 2004, when the FLASQ-PD was designed to screen for potential DBS candidates.[27] Since then, several studies have been published to provide guidance to general neurologists to determine which PD patient might benefit from AT, either using clinical criteria or using wearable sensors.[5,13–15,28–30] However, most papers were focused on referral criteria for DBS only. In addition, referral criteria were based on expert consensus or assessment of fictitious cases by a group of experts.[10,11] The present study, however, used a different approach, as it investigated eligibility for referral for at least one AT by studying an unselected group of consecutive PD patients in routine care. An advantage of this methodology was the possibility to estimate the prevalence of referral eligibility, and thus to obtain a reliable estimation of the positive predictive value.
During the data collection of this study, Antonini et al. presented a consensus statement of international experts that eventually led to the development of the so-called 5-2-1 criteria.[14,31] The 5-2-1 criteria were introduced to help general neurologists identify advanced PD patients who are possibly eligible for AT.[17] These criteria are easy to use and the name itself serves as a mnemonic device. Our study data showed that the 5-2-1 criteria have an almost perfect sensitivity, but a considerable number of false positives.
In the present study, we developed a new model using empirical data and adequate statistical techniques. The variables in our final model partly corroborate the items included in the 5-2-1 criteria. Both the 5-2-1 criteria and our model include presence of dyskinesias and a measure of medication use (frequency of drugs vs. LEDD, respectively). However, the 5-2-1 includes «at least 2 hours off time per day» while this variable was excluded from our model during statistical item selection. Instead, our model includes presence of response fluctuations, which implicitly refers to presence of off periods.
Despite the similarities between the two models, decision curve analysis (DCA) showed that our screening tool had a higher net benefit than the 5-2-1 criteria for all possible threshold values. Nevertheless, both the screening tool and the 5-2-1 criteria had similar testing characteristics for low threshold values of the screening tool. The threshold value of DCA is informative of how one weighs the relative ‘costs’ of false-positive and false-negative results (e.g. a threshold of 5% is similar to saying “not referring a patient (false negative) is 19 times worse than referring a patient unnecessarily”). [24,26] Thus, both models are comparable when the costs of false negatives are considerably lower than the costs of inappropriate referrals. But in the absence of empirical data on the clinical utility of referring patients for AT, it remains unknown how one should weigh such costs. Therefore, it is not yet possible to decide on the optimal threshold value for referring patients.
For the purpose of illustration, we arbitrarily chose a cut-off point for our screening tool by maximising the sum of sensitivity and specificity. This resulted in a positive predictive value of 76% but an estimated sensitivity of 88%, being slightly lower that the 5-2-1 criteria. However, using the model is an iterative process. Importantly, during a subsequent check-up with slight progression of disease the patients initially ‘overlooked’ using the screening tool may well have become eligible and still being in time for subsequent AT.
This study had some limitations. First, the reference test, for which we used an expert consensus, was suboptimal. The experts had to judge eligibility for referral based on the information from the study assessments forms, as filled out by the general neurologist. Possibly, the experts would judge differently if they could have examined the patients themselves. However, as the outcome of this study was eligibility for referral, a fair judgement is based on referral information only. Another limitation is that the model has not yet been validated in a different population. Currently, our research team is about to conduct a validation study in another region in the Netherlands. Validation of the screening tool is essential prior to application in daily practice.
Future studies do not only need to validate the newly-developed screening tool, but also need to determine whether patients will actually benefit from the screening tool. In addition, future studies should include patient’s expectations about the treatment options, as it was recently shown that patient’s expectation are a good predictor of eligibility for DBS.[32] Moreover, new insights on the timing of AT, for example the expected results from the EARLY-PUMP trial studying earlier application of CSAI,[33] might require revision of the developed screening tool and other selection criteria.
In conclusion, we developed a prototype screening tool to screen for PD patients being eligible for referral for AT. Our tool appeared to outperform the 5-2-1 criteria. We expect that, after external validation of the model, implementation of the new screening tool will contribute to better and more timely recognition of patients who are eligible for referral for AT. This will lead to a more efficient referral policy, as more patients will be referred who were otherwise not recognized, while the number of inappropriate referrals will decrease.
Data Availability
Data are available upon reasonable request.
SUPPLEMENTARY INFORMATION
Tables and figures referenced in the main text are listed in the appendix. The TRIPOD checklist is also included in the appendix.
FUNDING
This study was funded by the Dutch Parkinson Patient ‘s Association (Parkinson Vereniging).
TRIPOD Checklist: Prediction Model Development and Validation
ACKNOWLEDGMENTS
The authors thank all patients, general neurologists and Parkinson nurses who participated in this study.
In addition, the authors thank Wim Simons and Martin van Schijndel from the Dutch Parkinson Patient ‘s Association for their critical reflections on this design and results of this study and for their feedback on an earlier version of this manuscript.
Thanks to Paulus Bax for his valuable comments on the statistical analyses. We thank Claudia B. Gremmer for digitalizing and abstracting the study documents.
Footnotes
Anyone who wishes to share, reuse, remix, or adapt this material must obtain permission from the corresponding author.