Abstract
Background While numerous studies have identified factors associated with severe COVID-19 outcomes, they have yet to quantify the risk these factors carry. Our study therefore aims to stratify these risk factors and use them to predict outcomes.
Study Design This is a retrospective review of the CDC COVID-19 Surveillance Data. Logistic regression models estimated the risk associated with each independent variable, and random forest models predicted the probability of severe outcomes.
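As a rough illustration of how a logistic regression yields a risk estimate, the sketch below converts a fitted coefficient into an odds ratio with a Wald confidence interval. The abstract does not state how the risk estimates were expressed, so the odds-ratio form, the function name `odds_ratio_with_ci`, and the coefficient and standard error values are all assumptions made for illustration only.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient into an odds ratio.

    beta: fitted log-odds coefficient for one predictor
    se:   standard error of that coefficient
    z:    normal quantile for the interval (1.96 gives roughly 95%)
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper


# Hypothetical values, not taken from the study: a coefficient of ln(2)
# corresponds to a doubling of the odds of the outcome.
estimate, low, high = odds_ratio_with_ci(beta=math.log(2), se=0.1)
print(f"OR = {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An odds ratio above 1 indicates increased odds of the outcome for that predictor, and an interval that excludes 1 is conventionally read as statistically significant.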
Results Our sample of 3,798,261 patients with COVID-19 consisted mainly of females (51.9%), 10- to 69-year-olds, and White/Non-Hispanics (34.9%). Most were not healthcare workers (90.6%), and 47.1% did not have preexisting medical conditions. The risk of severe outcomes increased with every decade of life. White patients had a lower occurrence of severe outcomes than Non-White patients, except for Pacific Islanders, who had comparable mortality. The variable selection algorithm found that models for three outcomes were more accurate without the healthcare worker classification: mechanical ventilation/intubation, pneumonia, and acute respiratory distress syndrome (ARDS). However, healthcare workers had a decreased risk of severe outcomes overall. Patients with preexisting conditions demonstrated an increased risk of all outcomes. Compared to the logistic regressions, the predictive models performed better (AUC > 0.8). The death model had the best metrics, followed by hospitalization and ventilation. We compiled these predictive models into the Severe COVID-19 Calculator web application, which estimates the probability of severe outcomes.
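For readers unfamiliar with the AUC metric cited above, the following minimal sketch computes it from first principles: the AUC equals the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it, with ties counted as half. The function name `auc` and the toy labels and scores are hypothetical and do not come from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a case with the outcome, 0 for a case without it
    scores: predicted probabilities (or any ranking score), same order
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical outcome labels (1 = severe outcome) and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auc(labels, scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value above 0.8 indicates that the model ranks patients with a severe outcome above those without it in more than 80% of such pairs.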
Conclusions Several patient social and medical demographics recorded by the CDC significantly affect severe COVID-19 outcomes, suggesting a multifactorial influence. To account for these variables, we generated the Severe COVID-19 Calculator, which can accurately predict the chance of severe outcomes in individuals who may contract or already have COVID-19.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The author(s) received no specific funding for this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The oversight body that provided approval for the attached study was the Centers for Disease Control and Prevention (CDC). After completing the CDC COVID-19 Case Surveillance Restricted Access Detailed Data Registration Information and Data Use Restrictions Agreement (RIDURA) on Jan 13, 2021, we received an invitation from @Npa6 at the CDC to access the GitHub repository "cdc-data/covid_case_restricted_detailed", formally known as the COVID-19 Case Surveillance Restricted Use Detailed Data. Those who would like to access this data or have questions about this repository can contact the [AskSRRG Mailbox](mailto:eocevent394@cdc.gov). Additionally, we received retrospective approval for collecting the original data from the Institutional Review Board of The Guthrie Clinic, confirming that this data complies with the Office of Human Research Protections (OHRP) and the policies of the Institutional Review Board of The Guthrie Clinic (see attached supplemental letter from Michael Georgetson, Chairman, Institutional Review Board).
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
No - some restrictions will apply. The data used to generate this manuscript can be found on the CDC website. As mentioned in our methods, the data is available as "COVID-19 Case Surveillance Restricted Access."
https://data.cdc.gov/Case-Surveillance/COVID-19-CaseSurveillance-Restricted-Access-Detai/mbd7-r32t