Abstract
COVID-19 is a viral disease that affects people in different ways: most develop mild symptoms, some require hospitalization, and a few die. Identifying risk factors is therefore vital to support physicians' treatment decisions. The objective of this paper is to determine whether unsupervised analysis of the risk factors of COVID-19-positive and COVID-19-negative subjects can uncover a small set of reliable and clinically relevant risk-profiles. We selected 13367 positive and 19958 negative hospitalized patients from the Mexican Open Registry. Registry patients were described by 13 risk factors, three different outcomes, and COVID-19 test results. Hence, the dataset could be described by 6144 different risk-profiles per age group. To discover the most common risk-profiles, we propose the use of unsupervised learning. The data were split into discovery (70%) and validation (30%) sets. The discovery set was analyzed using the partitioning around medoids (PAM) method, and robust consensus clustering was used to estimate the stable set of risk-profiles. We validated the reliability of the PAM models by predicting the risk-profile of the validation set subjects. The clinical relevance of the risk-profiles was evaluated on the validation set by characterizing the prevalence of the three patient outcomes: pneumonia diagnosis, ICU admission, or death. The analysis discovered six positive and five negative COVID-19 risk-profiles with strong statistical differences among them. Hence, PAM clustering with consensus mapping is a viable method for unsupervised risk-profile discovery among subjects with critical respiratory health issues.
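The discovery/validation workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic binary profiles, the Hamming distance is an assumed dissimilarity, the function names (`hamming`, `total_cost`, `pam`, `assign`) are hypothetical, and the robust consensus clustering step is omitted entirely. The `pam` function is a naive swap search started from random medoids, not an optimized PAM.

```python
import random

def hamming(a, b):
    """Number of risk factors on which two binary profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def total_cost(data, medoids):
    """Sum of distances from every subject to its nearest medoid."""
    return sum(min(hamming(row, data[m]) for m in medoids) for row in data)

def pam(data, k, rng):
    """Naive PAM: random start, then greedy medoid/non-medoid swaps."""
    medoids = rng.sample(range(len(data)), k)
    best = total_cost(data, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(data)):
                if cand in medoids:
                    continue
                # Try replacing medoid i with the candidate subject.
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(data, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return medoids, best

def assign(row, data, medoids):
    """Index (0..k-1) of the medoid closest to a subject."""
    return min(range(len(medoids)),
               key=lambda j: hamming(row, data[medoids[j]]))

# Synthetic stand-in for the registry: 13 binary risk factors per subject,
# generated as noisy copies of three made-up prototype profiles.
rng = random.Random(0)
prototypes = [[0] * 13, [1] * 6 + [0] * 7, [0] * 7 + [1] * 6]

def noisy(proto):
    """Flip each risk factor with 5% probability."""
    return [bit ^ (rng.random() < 0.05) for bit in proto]

subjects = [noisy(prototypes[i % 3]) for i in range(90)]
rng.shuffle(subjects)

# 70% discovery / 30% validation split (integer arithmetic avoids rounding).
split = len(subjects) * 7 // 10
discovery, validation = subjects[:split], subjects[split:]

# Discover medoid profiles on the discovery set only.
medoids, cost = pam(discovery, 3, rng)

# Predict a risk-profile for each validation subject: nearest medoid.
val_labels = [assign(row, discovery, medoids) for row in validation]
print("medoid profiles:", [discovery[m] for m in medoids])
print("validation cluster sizes:", [val_labels.count(j) for j in range(3)])
```

In this sketch each medoid is an actual subject from the discovery set, so every discovered profile is a real combination of risk factors rather than an averaged centroid. The number of clusters is fixed at three here only because the synthetic data have three prototypes.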
Competing Interest Statement
The authors have declared no competing interest.
Clinical Trial
Our study was not a clinical trial.
Funding Statement
This research was supported with funding from the Mexican National Council for Science and Technology (CONACYT).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Our work did not require ethical oversight.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Paper in collection COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv