Abstract
Background The global pandemic of COVID-19 has challenged healthcare organizations and caused numerous deaths and hospitalizations worldwide. The need for data-based decision support tools for many aspects of controlling and treating the disease is evident but has been hampered by the scarcity of real-world reliable data. Here we describe two approaches: a. the use of an existing EMR-based model for predicting complications due to influenza combined with available epidemiological data to create a model that identifies individuals at high risk to develop complications due to COVID-19 and b. a preliminary model that is trained using existing real world COVID-19 data.
Methods We have utilized the computerized data of Maccabi Healthcare Services a 2.3 million member state-mandated health organization in Israel. The age and sex matched matrix used for training the XGBoost ILI-based model included, circa 690,000 rows and 900 features. The available dataset for COVID-based model included a total 2137 SARS-CoV-2 positive individuals who were either not hospitalized (n = 1658), or hospitalized and marked as mild (n = 332), or as having moderate (n = 83) or severe (n = 64) complications.
Findings The AUC of our models and the priors on the 2137 COVID-19 patients for predicting moderate and severe complications as cases and all other as controls, the AUC for the ILI-based model was 0.852[0.824–0.879] for the COVID19-based model – 0.872[0.847–0.879].
Interpretation These models can effectively identify patients at high-risk for complication, thus allowing optimization of resources and more focused follow up and early triage these patients if once symptoms worsen.
Funding There was no funding for this study
Evidence before this study We have search PubMed for coronavirus[MeSH Major Topic] AND the following MeSH terms: risk score, predictive analytics, algorithm, predictive analytics. Only few studies were found on predictive analytics for developing COVID19 complications using real-world data. Many of the relevant works were based on self-reported information and are therefore difficult to implement at large scale and without patient or physician participation.
Added value of this study We have described two models for assessing risk of COVID-19 complications and mortality, based on EMR data. One model was derived by combining a machine-learning model for influenza-complications with epidemiological data for age and sex dependent mortality rates due to COVID-19. The other was directly derived from initial COVID-19 complications data.
Implications of all the available evidence The developed models may effectively identify patients at high-risk for developing COVID19 complications. Implementing such models into operational data systems may support COVID-19 care workflows and assist in triaging patients.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
None
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study protocol was approved by Maccabi Healthcare Services IRB
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Data referred to the study are not avaliable due to Israel privacy regulations