Abstract
Background Parkinson’s disease (PD) is a progressive neurodegenerative disorder that affects motor control, leading to symptoms such as tremor and impaired balance. Early diagnosis of PD is crucial for effective treatment, yet traditional diagnostic methods are often costly and time-consuming. This study explores the use of Artificial Intelligence (AI) and Machine Learning (ML) techniques applied to voice analysis to identify early signs of PD and support a precise diagnosis.
Objectives This paper aims to build an automatic binary classifier that detects and predicts PD from vocal biomarkers. We also use explainability methods to identify latent and important patterns in the input data with respect to the target, in order to inform how PD is characterized through voice. Finally, the model’s predicted probabilities are used to create a scoring system that expresses a patient’s odds of PD on a spectrum.
Methods We used a dataset comprising 81 voice recordings from healthy controls (HC) and PD patients, and applied a hybrid AI model combining Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Multiple Kernel Learning (MKL), and a Multilayer Perceptron (MLP). The model’s architecture was designed to extract and analyze acoustic features such as Mel-Frequency Cepstral Coefficients (MFCCs), local jitter, and local shimmer, all of which are indicative of PD-related voice impairments. Once the features are extracted, the model generates a prediction label (HC or PD) for each file. A scoring system then assigns each file a number between 0 and 1, indicating the stage of PD development.
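To make the perturbation features concrete, the following is a minimal sketch of how local jitter and local shimmer are commonly defined: the mean absolute difference between consecutive cycles, divided by the mean value across cycles. It is not the authors' extraction pipeline. The function names and the example period and amplitude values are hypothetical, and the sketch assumes the glottal cycle periods and peak amplitudes have already been measured by some other tool.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods):
    """Cycle-to-cycle variation of the pitch period (input in seconds)."""
    return local_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (input in any linear unit)."""
    return local_perturbation(amplitudes)


# Hypothetical measurements for four consecutive glottal cycles.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]

print(f"local jitter:  {local_jitter(periods):.4f}")
print(f"local shimmer: {local_shimmer(amplitudes):.4f}")
```

Both quantities are dimensionless ratios, so they are often reported as percentages.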
Results Our champion model yielded a diagnostic accuracy of 91.11%, recall of 92.50%, precision of 89.84%, an F1 score of 0.9113, and an area under the curve (AUC) of 0.9125. Furthermore, SHapley Additive exPlanations (SHAP) provided detailed insight into the model’s decision-making process, highlighting the most influential features contributing to a PD diagnosis. The scoring system produced a distinct separation in the PD probability assessments across the 81 analyzed audio samples, which supports the scoring system by indicating that the vocal biomarkers in the audio files correspond with their assigned scores.
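For readers less familiar with these metrics, the sketch below shows how accuracy, recall, precision, and F1 follow from the four confusion-matrix counts of a binary classifier, with PD as the positive class. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's confusion matrix. AUC is omitted because it requires the per-sample scores rather than the counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: 10 PD recordings and 10 HC recordings.
metrics = binary_metrics(tp=8, fp=1, fn=2, tn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```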
Conclusion This study highlights the efficacy of AI, particularly a hybrid model combining CNN, RNN, MKL, and deep learning, in diagnosing early PD through voice analysis. The model demonstrated a robust ability to distinguish between HC and PD patients with high accuracy by leveraging key vocal biomarkers such as MFCCs, jitter, and shimmer.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The authors have no funding to report.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The data supporting the findings of this study are openly available in Figshare at https://doi.org/10.6084/m9.figshare.238491273. These data were derived from the following resources available in the public domain: https://figshare.com/articles/dataset/Voice_Samples_for_Patients_with_Parkinson_s_Disease_and_Healthy_Controls/23849127.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data supporting the findings of this study are openly available in Figshare at https://doi.org/10.6084/m9.figshare.238491273. These data were derived from the following resources available in the public domain: https://figshare.com/articles/dataset/Voice_Samples_for_Patients_with_Parkinson_s_Disease_and_Healthy_Controls/23849127.