Abstract
In March 2020, the World Health Organization declared COVID-19, the disease caused by infection with the SARS-CoV-2 virus, a global pandemic. In this context, this work applies Data Mining and Machine Learning techniques to the diagnosis of infection. A methodology was created to facilitate this task, and it can be applied to any outbreak or pandemic wave. Besides generating diagnostic models based only on signs and symptoms, the method can evaluate whether signs and symptoms differ between waves (or outbreaks) by applying explainability techniques to the machine learning models. It can also identify possible quality differences between exams, for example, Rapid Test (RT) and Reverse Transcription–Polymerase Chain Reaction (RT-PCR). The case study is based on data from patients who sought care at the Piquet Carneiro Polyclinic of the State University of Rio de Janeiro. In this work, the results obtained with the tests were used to diagnose symptomatic SARS-CoV-2 infection, based on the related signs and symptoms and their onset date. Using the Random Forest model, it was possible to achieve up to 76% sensitivity, 86% specificity, and 79% accuracy in the results of tests in one contagion wave of the SARS-CoV-2 virus. Moreover, differences in signs and symptoms were found between contagion waves, and the RT-PCR and RT antigen tests were observed to be more reliable than the RT antibody test.
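For readers less familiar with the three metrics reported above, the following minimal sketch shows how sensitivity, specificity, and accuracy are conventionally computed from binary labels. It is an illustration only, not the authors' code: the function name `diagnostic_metrics` and the toy labels are hypothetical and are not taken from the study data.

```python
import math


def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy for binary labels.

    Labels are assumed to be 1 (test positive / infected) or 0 (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Each ratio is undefined (NaN) when its denominator is zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    accuracy = (tp + tn) / len(pairs) if pairs else math.nan
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Toy labels invented for illustration; they are NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity is the fraction of truly positive patients that the model flags, specificity is the fraction of truly negative patients that it clears, and accuracy is the overall fraction of correct predictions.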
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Yes
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was conducted in accordance with ethical principles outlined in the Declaration of Helsinki and was approved by the Pedro Ernesto University Hospital Ethical Committee (CAAE: 30135320.0.0000.5259).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
† Deceased
¶ Membership list can be found in the Acknowledgments section.
1 Source: Municipal Health Secretariat of Rio de Janeiro - https://experience.arcgis.com/experience/38efc69787a346959c931568bd9e2cc4 - 11/17/2021
Data Availability
All files are available from the OSF.io database. https://osf.io/d59g8/?view_only=6709b1471fe64cab8bc3208fec1e84b8