Abstract
Purpose The coronavirus disease 2019 (COVID-19) has caused a crisis worldwide. Considerable efforts have been made to prevent and control the transmission of COVID-19, from early screenings to vaccinations and treatments. With the recent emergence of many automatic disease recognition applications based on machine listening techniques, detecting COVID-19 from recordings of cough, a key symptom of the disease, would be fast and cheap. To date, knowledge of the acoustic characteristics of COVID-19 cough sounds is limited, but it would be essential for building effective and robust machine learning models. The present study aims to explore acoustic features for distinguishing COVID-19 positive individuals from COVID-19 negative ones based on their cough sounds.
Methods Drawing on the theory of computational paralinguistics, we analyse the acoustic correlates of COVID-19 cough sounds based on the COMPARE feature set, i.e., a standardised set of 6,373 higher-level acoustic features. Furthermore, we train automatic COVID-19 detection models with machine learning methods and explore the latent features by evaluating the contribution of all features to the COVID-19 status predictions.
Results The experimental results demonstrate that a set of acoustic parameters of cough sounds, e.g., statistical functionals of the root mean square energy and Mel-frequency cepstral coefficients, is relevant for differentiating between COVID-19 positive and COVID-19 negative cough samples. Our automatic COVID-19 detection model performs significantly above chance level, i.e., at an unweighted average recall (UAR) of 0.632, on a data set consisting of 1,411 cough samples (COVID-19 positive/negative: 210/1,201).
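For readers unfamiliar with the metric, UAR is the mean of the per-class recalls, so the chance level for a two-class task is 0.5 regardless of class imbalance. The following minimal sketch illustrates this with the class sizes reported above; the function name and the always-negative classifier are hypothetical illustrations and not part of the study's pipeline.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # all samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Class sizes from the abstract: 210 positive and 1,201 negative cough samples.
y_true = ["pos"] * 210 + ["neg"] * 1201
# Hypothetical trivial classifier that always predicts the majority class.
y_pred = ["neg"] * 1411

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, UAR = {uar:.3f}")
# prints: accuracy = 0.851, UAR = 0.500
```

On data this imbalanced, plain accuracy rewards the trivial classifier, whereas UAR leaves it at chance level, which is why a UAR of 0.632 can be read against a 0.5 baseline.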
Conclusions Based on the analysis of acoustic correlates on the COMPARE feature set and the feature analysis of the effective COVID-19 detection model, we find that the machine learning method relies, to a certain extent, on acoustic features showing higher effects in conventional group difference testing.
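As an illustration of what ranking features by effect in a group difference test can look like, the sketch below orders features by the absolute standardised mean difference (Cohen's d) between the positive and negative groups. This is an assumption made for illustration only: the abstract does not state which test or effect measure was used, and the feature names and values here are made-up toy data, not results from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def rank_features_by_effect(features_pos, features_neg):
    """Return feature names sorted by descending absolute effect size."""
    effects = {
        name: abs(cohens_d(features_pos[name], features_neg[name]))
        for name in features_pos
    }
    return sorted(effects, key=effects.get, reverse=True)


# Toy values (hypothetical): one list of per-sample feature values per group.
positive = {
    "rms_energy_mean": [0.9, 1.1, 1.0, 1.2],
    "mfcc1_mean": [5.0, 5.2, 4.9, 5.1],
}
negative = {
    "rms_energy_mean": [0.5, 0.6, 0.4, 0.5],
    "mfcc1_mean": [5.0, 5.1, 5.0, 5.2],
}

ranking = rank_features_by_effect(positive, negative)
print(ranking)
```

Such a ranking could then be compared with the feature contributions of a trained model to see how far the two agree.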
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work is supported by the European Union's Horizon 2020 research and innovation programme under Marie Sklodowska-Curie Actions Initial Training Network European Training Network project TAPAS (grant number 766287), the Deutsche Forschungsgemeinschaft's Reinhart Koselleck project AUDI0NOMOUS (grant number 442218748), and the Federal Ministry of Education and Research (BMBF), Germany, under the project LeibnizKILabor (grant No. 01DD20003).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The authors want to express their gratitude to the holders of the COUGHVID crowdsourcing dataset for providing collected data for research purposes. The datasets (Orlandic et al., 2021a, 2021b) generated during and/or analyzed during the current study are available in the Zenodo repository, https://zenodo.org/record/4498364#.YekMM-rMLD4. We as authors of the present study have neither seen the original ethics committee approval nor the participant consent forms.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The datasets (Orlandic et al., 2021a, 2021b) generated during and/or analyzed during the current study are available in the Zenodo repository, https://zenodo.org/record/4498364#.YekMM-rMLD4