Clinical predictors of COVID-19 mortality

Arjun S. Yadaw; Yan-chak Li; Sonali Bose; Ravi Iyengar; Supinda Bunyavanich; Gaurav Pandey

doi:10.1101/2020.05.19.20103036

Abstract

Background The coronavirus disease 2019 (COVID-19) pandemic has affected millions of individuals and caused hundreds of thousands of deaths worldwide.

It can be difficult to accurately predict mortality among COVID-19 patients presenting with a spectrum of complications, hindering the prognostication and management of the disease.

Methods We applied machine learning techniques to clinical data from a large cohort of 5,051 COVID-19 patients treated at the Mount Sinai Health System in New York City, the global COVID-19 epicenter, to predict mortality. Predictors were designed to classify patients into Deceased or Alive mortality classes and were evaluated in terms of the area under the receiver operating characteristic (ROC) curve (AUC score).

Findings Using a development cohort (n=3,841) and a systematic machine learning framework, we identified a COVID-19 mortality predictor that demonstrated high accuracy (AUC=0·91) when applied to test sets of retrospective (n=961) and prospective (n=249) patients. This mortality predictor was based on five clinical features: age, minimum O₂ saturation during encounter, type of patient encounter (inpatient vs. various types of outpatient and telehealth encounters), hydroxychloroquine use, and maximum body temperature.

Interpretation An accurate and parsimonious COVID-19 mortality predictor based on five features may have utility in clinical settings to guide the management and prognostication of patients affected by this disease.

Funding This work was funded by the National Institutes of Health.

Introduction

The coronavirus disease 2019 (COVID-19) pandemic has affected over 3.6 million individuals and caused over 250,000 deaths worldwide as of May 5, 2020.(1) Although the causative SARS-CoV-2 virus primarily targets the respiratory system(2, 3), complications in other organ systems, e.g., cardiovascular, neurologic and renal, can also contribute to death from the disease. Clinical experience thus far has demonstrated significant heterogeneity in the trajectory of SARS-CoV-2 infection, spanning patients who are asymptomatic to those with mild, moderate, and severe disease forms, with a high percentage of patients who do not survive(2, 3). Notably, it can be difficult to accurately predict clinical outcomes for patients across this spectrum of clinical presentations. This presents an enormous challenge to the prognostication and management of COVID-19 patients, especially within disease epicenters such as New York City (NYC) that need to triage a high volume of patients. Accurate prediction of COVID-19 mortality and the identification of contributing factors would therefore allow for targeted strategies in patients with the highest risk of death.

Towards this aim, we analyzed clinical data from 5051 patients who had laboratory confirmed COVID-19 and were treated within multiple hospitals and locations of the Mount Sinai Health System spanning different boroughs of NYC. We used multiple machine learning-based classification algorithms(4) to develop models that can accurately predict mortality from COVID-19. We also identified clinical features that contributed the most to this prediction. An improved understanding of predictive factors for COVID-19 is critical to the development of clinical decision support systems that can better identify those with higher risk of mortality, and inform interventions to reduce the risk of death.

Methods

Study population

De-identified electronic medical record (EMR) data from patients diagnosed with COVID-19 within the Mount Sinai Health System, New York, NY through April 7, 2020 were included in the study. The Mount Sinai Health System is a network of 5 hospital campuses and over 400 ambulatory practices spanning the New York metropolitan area (Supplementary Table 1). COVID-19 diagnosis was based on positive polymerase chain reaction (PCR)-based clinical laboratory testing for the SARS-CoV-2 virus.

Data from COVID-19 patients through April 6, 2020 were randomly split into two groups of independent subjects comprising 80% of the sample (n=3841) for development of the mortality predictor (i.e. development set), and 20% (n=961) to serve as retrospective test set 1. A prospective validation set of independent subjects, test set 2, included COVID-19 patients encountered on April 7, 2020 (n=249).

Identification and validation of the predictor

We implemented a systematic machine learning-based framework to construct the mortality predictor from the development set using missing value imputation(5), feature selection(6), classification(4) and statistical(7) techniques. The goal of this predictor was to classify a COVID-19 patient as likely to survive or die from the disease, i.e., “Alive” or “Deceased,” respectively. The identified predictor was then validated in test sets 1 and 2 in terms of the Area Under the ROC Curve (AUC score)(8). The overall workflow is shown in Figure 1, and detailed methods are provided in Supplementary Material.
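
To make the evaluation metric concrete, the following is a minimal sketch of the AUC score in its rank-based form: the probability that a randomly chosen Deceased patient receives a higher predicted risk than a randomly chosen Alive patient. The function name `auc_score` and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 for the positive class ("Deceased"), 0 for the negative class ("Alive").
    scores: predicted risk; a higher score means "more likely Deceased".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one patient in each class")
    # Count Deceased/Alive pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0·5 corresponds to random ranking and 1·0 to perfect separation of the two classes, which is why the metric is insensitive to the choice of a classification threshold.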

Figure 1: Workflow for data management and COVID-19 mortality predictor development.

Data were obtained from the Mount Sinai Data Warehouse (MSDW). After pre-processing, COVID-19-positive patients’ data (n=4802) were randomly divided in an 80:20 ratio into a predictor development (n=3841) and an independent retrospective validation dataset (test set 1; n=961). For predictor training and selection, the development set was further randomly split into a 60% training dataset (n=2880) and a 20% holdout dataset (n=961). Four classification algorithms (logistic regression (LR), random forest (RF), support vector machine (SVM) and eXtreme Gradient Boosting (XGBoost)) were evaluated. The final predictive model was validated on test set 1 and another independent prospective validation set (test set 2; n=249). The complete details of the computational methods used can be found in Supplementary Material.
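
The nested split described in this caption can be sketched as follows. This is only an illustration of the arithmetic: the helper `split_indices`, the seeds, and the round-up rule for the test size are assumptions (the rounding rule is not stated in the text, although rounding up reproduces the reported set sizes).

```python
import math
import random


def split_indices(n, test_fraction, seed):
    """Randomly split range(n) into (train, test) index lists.

    Assumption: the test size is rounded up; the paper does not state
    which rounding rule was used.
    """
    n_test = math.ceil(n * test_fraction)
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


# 80:20 split of the 4802 pre-processed patients.
development, test_set_1 = split_indices(4802, 0.20, seed=0)

# The holdout set is 20% of the original cohort, i.e. a quarter of the
# development set; the remaining 60% of the cohort is used for training.
train_pos, holdout_pos = split_indices(len(development), 0.25, seed=1)
training = [development[i] for i in train_pos]
holdout = [development[i] for i in holdout_pos]

print(len(development), len(test_set_1), len(training), len(holdout))
```

Because the "60%" and "20%" figures refer to the original 4802 patients, the second split takes 25% of the development set rather than 20% of it.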

Role of the funding source

The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Results

Patient characteristics

The demographic and clinical characteristics of the COVID-19 patients included in the development set (n=3841), test set 1 (n=961) and test set 2 (n=249) are shown in Table 1. The majority (55·3%) of patients in the development set were male, with an even higher prevalence of male sex among the deceased (61·3%). COVID-19 patients were mostly Caucasian (25·3%), African American (26·2%) and Latino (24·3%), with a minority identifying as Asian (4·2%). Hypertension and diabetes were the most common comorbidities (22·6% and 15·8%, respectively). While a small minority were obese (6·0%) or had cancer (5·4%), an even smaller percentage had asthma (4·2%), COPD (2·3%), or currently smoked (3·5%). Over a third of the patients had been treated with azithromycin and/or hydroxychloroquine, consistent with the health system’s treatment practices during this time period.

Table 1: Characteristics of patients in the development and test sets.

%Number of patients in each class in the corresponding set is shown in parentheses below the name of the class.

*P-values from Student’s t-test for continuous features and chi-square test for categorical features.

Univariate analyses of patient characteristics in the development set (Table 1) showed that COVID-19 patients who died were significantly older with a mean age of 73·4 (SD 12·7) vs. 54·7 (SD 18·7) years in survivors (P<0·001). They were more likely to have had their initial encounter at a hospital rather than at an outpatient or telehealth setting within our hospital system (P<0·001). Those who died had higher body temperature and lower oxygen saturation at initial presentation, and their minimum oxygen (O₂) saturation over the duration of their encounter was also lower (P<0·001 for all). Death from COVID-19 was associated with smoking (P=0·05), COPD (P<0·001), hypertension (P<0·001), and diabetes (P<0·001).

The characteristics of test sets 1 and 2 were largely similar to those of the development set, except for some differences in the relative proportions of race (Table 1). While minimum O₂ saturation during encounter was consistently lower in the deceased vs. alive patients in both test sets, O₂ saturation at presentation was lower among the deceased in test set 1 only. COPD, hypertension, and diabetes were more prevalent among the deceased in test set 1, but there were no significant differences in these comorbidities in test set 2.

Development and validation of the predictor

Following imputation, there were twenty distinct clinical features with less than 20% missing values in the development set that improved predictor performance (Figure 2A). Compared to the other classification algorithms (LR, RF, SVM), XGBoost performed significantly better at this and higher levels of missing values (Figure 2A; Friedman-Nemenyi P<0·001). Therefore, we used the imputed version of the development set with 20 features and XGBoost to develop the first COVID-19 mortality predictor in this study, referred to as the 20F model.
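
As a rough illustration of the Friedman-Nemenyi comparison of classifiers(7), the sketch below ranks four algorithms within each run, computes the Friedman chi-square statistic from the average ranks, and derives the Nemenyi critical difference. The AUC values are hypothetical, and the constant 2.569 is the α=0.05 critical value for four algorithms as tabulated by Demsar(7); the study's actual implementation is described in its Supplementary Material and may differ.

```python
import math

ALGORITHMS = ["LR", "RF", "SVM", "XGBoost"]

# Hypothetical holdout AUCs (rows = repeated runs, columns = ALGORITHMS).
auc_table = [
    [0.86, 0.88, 0.85, 0.91],
    [0.87, 0.89, 0.84, 0.92],
    [0.85, 0.88, 0.83, 0.90],
    [0.86, 0.87, 0.85, 0.91],
    [0.84, 0.89, 0.83, 0.92],
]


def average_ranks(table):
    """Mean rank of each column over rows (rank 1 = highest AUC; ties share ranks)."""
    k = len(table[0])
    totals = [0.0] * k
    for row in table:
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of ranks pos+1 .. end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / len(table) for t in totals]


def friedman_statistic(table):
    """Friedman chi-square statistic computed from the average ranks."""
    n, k = len(table), len(table[0])
    ranks = average_ranks(table)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(n, k, q_alpha):
    """Two algorithms differ significantly if their average ranks differ by more than this."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))


ranks = average_ranks(auc_table)
statistic = friedman_statistic(auc_table)
critical_difference = nemenyi_critical_difference(len(auc_table), len(ALGORITHMS), 2.569)
```

With only five hypothetical runs the critical difference is large (about 2.1 rank units); with 100 runs, as in the study, much smaller rank differences become detectable.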

Figure 2: Results from missing value imputation and feature selection during predictor training and selection.

(A) We attempted to find the optimal level of missing values in the range of 0% to 60% that could be imputed and lead to more accurate predictors. For this, we took incremental steps of 5% in missing value levels, and used mean and mode imputation for continuous and categorical features respectively. At each level, four candidate classifiers (LR, RF, SVM and XGBoost) were trained and evaluated on the corresponding holdout set in terms of the area under the receiver operating characteristic (ROC) curve (AUC score) as the metric. This process was repeated 100 times and the average AUCs for each candidate classifier and missing value level are shown here, along with error bars denoted by vertical arrows. (B) Using a setup analogous to (A), and the Recursive Feature Elimination (RFE) algorithm, we evaluated the performance of the four classification algorithms with different numbers of features selected from the full set of twenty. The average AUC scores from 100 runs of this process are shown here, along with error bars denoted by vertical arrows. The details of the computational methods underlying these analyses are provided in Supplementary Material. LR=logistic regression; RF=random forest; SVM=support vector machine.
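
The missing-value sweep in panel (A) can be sketched as below: at each level, features with more missing values than the level allows are dropped, and the rest are filled with the mean (continuous) or mode (categorical). The function `impute_features`, the feature names, and the toy values are hypothetical; the "at most" comparison is also an assumption, since the text uses both "less than 20%" and "at most 20%".

```python
from collections import Counter


def impute_features(columns, kinds, max_missing_fraction):
    """Keep features whose missing fraction is at most the threshold, then impute.

    columns: dict of feature name -> list of values, with None marking missing.
    kinds:   dict of feature name -> "continuous" or "categorical".
    """
    imputed = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        missing_fraction = (len(values) - len(observed)) / len(values)
        if not observed or missing_fraction > max_missing_fraction:
            continue  # too sparse to impute at this level
        if kinds[name] == "continuous":
            fill = sum(observed) / len(observed)           # mean imputation
        else:
            fill = Counter(observed).most_common(1)[0][0]  # mode imputation
        imputed[name] = [fill if v is None else v for v in values]
    return imputed


# Hypothetical toy records, not taken from the study.
toy_columns = {
    "AGE": [70, None, 50, 60, 80],                                                 # 20% missing
    "ENCOUNTER_TYPE": ["inpatient", "outpatient", None, "inpatient", "inpatient"],  # 20% missing
    "LAB_X": [None, None, 1.2, None, 0.8],                                         # 60% missing
}
toy_kinds = {"AGE": "continuous", "ENCOUNTER_TYPE": "categorical", "LAB_X": "continuous"}

# Missing-value levels from 0% to 60% in steps of 5%.
levels = [percent / 100 for percent in range(0, 61, 5)]
kept_by_level = {level: sorted(impute_features(toy_columns, toy_kinds, level)) for level in levels}
at_twenty_percent = impute_features(toy_columns, toy_kinds, 0.20)
```

In a full pipeline the fill values would normally be computed on the training portion only and then applied to the holdout data, so that no information leaks from the evaluation set.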

We also tested if a smaller subset of the 20 features could yield an even more accurate predictor, since such a subset would be easier to study and implement in a clinical setting. Indeed, we found that for the best-performing XGBoost algorithm (Friedman-Nemenyi P<0·001), the AUC saturated at as few as five features (Figure 2B), validating our hypothesis that fewer than 20 features could yield an accurate predictor. The five features identified from the development set included the following: minimum oxygen saturation recorded during the encounter, patient’s age, type of encounter, maximum body temperature during the encounter, and use of hydroxychloroquine during treatment. We trained this second COVID-19 mortality predictor, referred to as the 5F model, by applying XGBoost to these 5 features in the imputed development set.
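
The elimination loop behind RFE can be sketched in a few lines: score the remaining features, drop the weakest, and repeat until the target number is left. In the study the scores come from refitting each classifier (LR, RF, SVM or XGBoost) on the remaining features; here a simple univariate stand-in scorer, `mean_gap_importance`, is used purely so the example runs without external libraries. All names and toy values are hypothetical.

```python
def recursive_feature_elimination(features, labels, importance_fn, n_keep):
    """Drop the least important feature, rescore, and repeat until n_keep remain.

    features: dict of feature name -> list of numeric values.
    importance_fn: callable(features, labels) -> dict of name -> importance;
        in practice this would refit a classifier on the remaining features.
    Returns (kept feature names, eliminated names in order of removal).
    """
    remaining = dict(features)
    eliminated = []
    while len(remaining) > n_keep:
        importance = importance_fn(remaining, labels)
        weakest = min(remaining, key=lambda name: importance[name])
        eliminated.append(weakest)
        del remaining[weakest]
    return list(remaining), eliminated


def mean_gap_importance(features, labels):
    """Stand-in scorer: |difference of class means| / overall standard deviation."""
    scores = {}
    for name, values in features.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        mean = sum(values) / len(values)
        sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
        gap = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
        scores[name] = gap / sd if sd > 0 else 0.0
    return scores


# Hypothetical toy data: 1 = Deceased, 0 = Alive.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_features = {
    "AGE": [80, 75, 70, 50, 55, 45],
    "O2SAT_MIN": [78, 82, 80, 95, 93, 94],
    "NOISE_A": [1, 2, 3, 1, 2, 3],
    "NOISE_B": [5, 6, 6, 5, 5, 6],
}
kept, dropped = recursive_feature_elimination(toy_features, toy_labels, mean_gap_importance, n_keep=2)
```

With a model-based scorer the importances change after every removal, which is what distinguishes RFE from a one-shot ranking; the univariate stand-in above does not capture that effect.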

Validation of the 20F and 5F models on test set 1 (retrospective data) and test set 2 (prospective data) both yielded very good performance (AUC>0·9; Figure 3). The predictors’ strong performance in both test sets demonstrated that predictors constructed from data on a given day can be reliably applied retrospectively and prospectively.

Figure 3: Performance of the final mortality predictors on two validation sets.

Based on the results in Figure 2, we constructed two predictors: (1) Training XGBoost, the best performer in Figure 2(A), on 20 features with at most 20% missing values (20F model), and (2) Training XGBoost, the best performer in Figure 2(B), on the optimal 5 features at which the performance saturated (5F model). Both these predictors were evaluated on the (A) retrospective Test set 1 (n=961) and the (B) prospective Test set 2 (n=249). Evaluation results are shown here in terms of the ROC curves obtained, as well as their area under the curve (AUC) scores. The 95% confidence intervals of the AUC scores are shown in square brackets.
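
The caption does not say how the 95% confidence intervals were computed. One common option, shown here only as an assumed illustration, is a percentile bootstrap: resample patients with replacement, recompute the AUC each time, and take the 2.5th and 97.5th percentiles. The function names, seed, and toy data are hypothetical.

```python
import random


def auc_score(labels, scores):
    """Rank-based AUC: fraction of positive/negative pairs ordered correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_resamples)]
    upper = stats[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper


# Hypothetical toy predictions: 1 = Deceased, 0 = Alive.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.95, 0.80, 0.70, 0.35, 0.60, 0.40, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]

point_estimate = auc_score(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores, n_resamples=200)
```

The interval narrows as the test set grows, so an interval for the prospective test set 2 (n=249) would be expected to be wider than one for test set 1 (n=961) at a similar AUC.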

Features predictive of mortality

Similar to the features that the 5F model was based on, we identified the five most predictive features for the other classification algorithms we tested (Figure 4A). While there was variability among these features due to the inherent differences among the algorithms, the age of the patient, and their minimum oxygen saturation level during the clinical encounter (O₂SAT_min) were consistent across the algorithms. The values of O₂SAT_min and age were indeed significantly different between the Deceased and Alive classes (Table 1, Figures 4B and 4C respectively; T-test P<0.001 for both features), affirming their predictive power. Supplementary Figure 1 shows that the top five features are consistent across all three runs of the feature selection and predictor development process.

Figure 4: Top predictive features selected for the four classification algorithms tested.

(A) Top five predictive features identified using the RFE algorithm for the four classification algorithms across three independent sets of hundred runs each of the predictor training and selection process described in Figures 1 and 2, and Supplementary Material. The values in parentheses indicate the number of times the feature was selected as top ranked. Also shown are violin plots representing the distributions of the values of the (B) O₂SAT_min and (C) age features that were selected as top predictive features for all the four algorithms. The plot in (B) shows that the median value of O₂SAT_min for the deceased group (79) was significantly lower (T-test P<0.001) than that for the alive group (92). Similarly, the plot in (C) shows that the median age in the deceased group (75) is significantly higher (T-test P<0.001) than that in the alive group (56).

Discussion

In this study, machine learning algorithms were applied to clinical and demographic data from 3841 COVID-19 patients from a major New York metropolitan area health system to identify and test a mortality predictor that demonstrated high accuracy (AUC=0·91) when applied to test sets of retrospective (n=961) and prospective (n=249) patient data. This mortality predictor was based on five clinical features: age, minimum O₂ saturation, type of patient encounter (inpatient vs. outpatient and telehealth encounters), hydroxychloroquine use, and maximum body temperature. Given the heterogeneity in clinical presentation and course observed among COVID-19 patients,(2, 3) factors that contribute most to mortality are not always readily apparent, rendering care and management of COVID-19 patients difficult in settings of finite health care resources. Our data-driven findings may help clinicians better recognize and prioritize the care of patients at greatest risk of death.

A major strength of this study is that it was based on recent data from thousands of COVID-19 patients encountered within a global disease epicenter (NYC), resulting in findings that are highly relevant to the current pandemic. The results are based on rigorous machine learning analyses powered by a robust sample of patients with laboratory-confirmed COVID and demonstrate the potential of these methods to identify factors predicting mortality within clinical settings. Application of machine learning enabled the identification of predictors based on the XGBoost algorithm(9), where the clinical features contributed to mortality in a non-linear fashion. These predictors performed with high accuracy (AUC=0·91–0·95) in two independent validation sets of COVID-19 patients. Furthermore, the 5F model based on only the five features listed above performed almost as well as the 20F model based on all the features. This indicates that accurate mortality predictions can be obtained from a more parsimonious model, facilitating more efficient implementation in clinical environments.

Age and minimum oxygen saturation during encounter (O₂SAT_min) were the most predictive features not only for the XGBoost algorithm, but for all four mortality classifiers tested (Figure 4), emphasizing these features’ epidemiological and clinical relevance. Since the beginning of this pandemic, older age has been recognized as a risk factor(10, 11). In New York State, patients 60 years and over represent nearly 85% of all deaths due to COVID-19(12), and similarly high rates of mortality among those of advanced age have been noted in other COVID-19 hotspots across the United States(13). In addition, the fundamental clinical presentation of COVID-19 patients across the pandemic has been respiratory symptoms associated with hypoxia, often leading to subsequent respiratory failure and requiring ventilator support and/or extracorporeal membrane oxygenation(14). This study’s finding that a patient’s minimum oxygen saturation (O₂SAT_min) value during hospitalization was the strongest predictive feature of mortality (Figure 4) is in line with global epidemiologic observations that respiratory failure is the most common feature of critical illness and death in COVID-19 patients(15, 16).

In addition to age and oxygen saturation, other features in the mortality predictor are also consistent with clinical observations accumulated from the pandemic experience to date. For example, the maximum body temperature achieved during hospitalization (TEMP_max) was a top-ranked feature common to the XGBoost and random forest-based mortality predictors (Figure 4A). While fever is a common symptom and sign of COVID-19,(2, 17, 18) patients may not always present with elevated temperatures, and fever frequently develops later during the course of hospitalization(2, 19). Consistent with this, these mortality predictors identified TEMP_max, rather than body temperature at presentation, as a top classifying feature. Similarly, health care encounter type (inpatient vs. outpatient and telehealth) was identified as a top-performing XGBoost mortality predictor, reflecting the fact that COVID-19 patients with more severe symptoms are more likely to have their initial encounter in the hospital rather than at an outpatient setting as their first point of contact. Finally, the identification of hydroxychloroquine therapy as a top mortality predictor reflects a practice pattern specific to encounters within our hospital system based on institutional guidelines provided during a time of limited and evolving knowledge about COVID-19 treatment(20, 21). Specifically, patients in our hospital system with moderate or severe disease were often placed on hydroxychloroquine in the absence of overt contraindications to this therapy.

Several other studies have also investigated factors affecting mortality from COVID-19. Some studies conducted statistical association analyses of individual patient characteristics and risk factors with mortality, albeit on small cohorts (<200 patients)(22–25). Another small cohort study used linear feature selection and predictor development methods to identify severe COVID-19 cases, achieving an AUC of 0·853 in a validation cohort(26). Some other studies have started leveraging clinical data from larger cohorts of several hundred patients to predict mortality and other COVID-19 outcomes(27). A relative strength of this study is that it employed a very large patient cohort and systematic combinations of machine learning methods to yield a more accurate and informative mortality predictor.

Machine learning-based methods are designed to sift through large amounts of structured and/or unstructured data to discover actionable knowledge without bias from biomedical hypothesis.(4, 28) In this study, we utilized this power of machine learning, especially methods for feature selection(6) and classification(7), to develop accurate and parsimonious predictors of mortality from COVID-19 from structured clinical and demographic data. In particular, we found that XGBoost(9) produced the best-performing predictors in all our analyses. XGBoost is a sophisticated prediction algorithm that builds an ensemble of decision trees by iteratively focusing on harder-to-predict subsets of the training data. Due to its systematic optimization-based design, this algorithm has shown superior performance in predictive modeling applications involving structured data(29, 30), which is consistent with our observations.
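
The idea of building an ensemble by repeatedly focusing on what is still predicted poorly can be illustrated with a deliberately simplified gradient-boosting sketch: each round fits a one-split decision tree ("stump") to the current residuals and adds a damped copy of it to the ensemble. This is not XGBoost itself, which additionally uses second-order gradient information, regularization and deeper trees(9); the function names, hyperparameters and toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split of one feature, minimising squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Squared-error gradient boosting with stumps on a single feature."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are largest for the samples the ensemble currently predicts worst.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, predictions


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Hypothetical toy data: age vs. outcome (1 = Deceased, 0 = Alive).
ages = [30, 40, 50, 60, 70, 80, 90]
died = [0, 0, 0, 1, 0, 1, 1]

base, stumps, fitted = boost(ages, died)
initial_mse = mean_squared_error(died, [base] * len(died))
final_mse = mean_squared_error(died, fitted)
```

Because the loss here is squared error, the fitted values are scores rather than calibrated probabilities; a classifier such as XGBoost would typically optimise a logistic loss instead.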

Limitations of the study

Although our datasets are likely the largest that have been used to predict COVID-19 mortality, the clinical features available to us were limited to those routinely collected during hospital encounters. Although we were able to develop accurate predictors from these limited data using our machine learning framework, it should be possible to develop even better predictors using a richer set of features. A key limitation of the clinical indices included in the datasets concerns the uniformity of Electronic Medical Record (EMR)-derived data. For example, while minimum oxygen saturation during the health encounter was identified as a significant predictor of mortality, limitations inherent in the interpretation of these data must be noted, such as the unavailability of the amount of supplemental oxygen being administered at the time of recording, and acquisition-related limitations, such as readings below the threshold of accuracy of the monitoring device (e.g. less than 70%). Nonetheless, we found a clearly lower distribution of minimum oxygen saturations in those patients who died from COVID-19 compared to those who survived, highlighting this clinical feature as central to predicting mortality for infected patients.

Conclusion

Applying machine learning approaches to data from a large cohort of COVID-19 patients resulted in the identification of accurate and parsimonious predictors of mortality. These data-driven findings may help clinicians better recognize and prioritize the care of patients at greatest risk of death.

Data Availability

Due to IRB restrictions, data aren't publicly available.

Declaration of interests

The authors declare no conflicts of interest.

Ethics statement

Per the research team, all relevant institutional guidelines for ethics and human subjects research specified by the Mount Sinai IRB have been followed. All the data used for these analyses had been deidentified by the Mount Sinai Data Warehouse, and made available to all Mount Sinai researchers who have undergone training in human subjects research. Given that our study uses these data that cannot be linked to specific individuals either directly or indirectly, and were not collected specifically for the currently proposed research project through any interaction with the patients, this project is considered not human subjects research.

Funding

Yadaw and Iyengar receive funding from the National Institutes of Health U54 HG008098 and P50 GM071558. Bunyavanich receives funding from the National Institutes of Health R01 AI118833, R01 AI147028, and U19 AI136053. Bose receives funding from the National Institutes of Health, R01 HL147328 and UG3 OD023337.

Non-author contributions

This work was supported in part through the computational and data resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai. We thank Sharon Nirenberg, MD, MS and Patricia Kovatch, BS, both of the Icahn School of Medicine at Mount Sinai, for their clarifications regarding the EMR data. Drs. Nirenberg and Kovatch did not receive compensation for their contributions. We are also grateful to the patients whose data this study is based on, as well as their diligent caretakers, such as family, doctors and nurses.

Data Access, Responsibility, and Analysis

Yadaw, Li, Iyengar and Pandey had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Supplementary Material

Acknowledgements

References

1. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533–4.
2. Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N Engl J Med. 2020;382(18):1708–20.
3. Richardson S, Hirsch JS, Narasimhan M, Crawford JM, McGinn T, Davidson KW, et al. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area. JAMA [Internet]. 2020 Apr 22. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32320003.
4. Alpaydin E. Introduction to machine learning. MIT Press; 2014.
5. Wells BJ, Chagin KM, Nowacki AS, Kattan MW. Strategies for handling missing data in electronic health record derived data. EGEMS (Wash DC). 2013;1(3):1035.
6. Saeys Y, Inza I, Larranaga P. A review of feature selection techniques in bioinformatics. Bioinformatics. 2007;23(19):2507–17.
7. Demsar J. Statistical Comparisons of Classifiers over Multiple Data Sets. J Mach Learn Res. 2006;7:1–30.
8. Lever J, Krzywinski M, Altman N. Classification evaluation. Nature Methods. 2016;13:603.
9. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; San Francisco, California, USA: Association for Computing Machinery; 2016. p. 785–94.
10. Wu Z, McGoogan JM. Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72314 Cases From the Chinese Center for Disease Control and Prevention. JAMA. 2020;323(13):1239–42.
11. Mehra MR, Desai SS, Kuy S, Henry TD, Patel AN. Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19. N Engl J Med. 2020.
12. New York State Department of Health. COVID-19 Tracker: Fatalities. 2020. Available from: https://covid19tracker.health.ny.gov/views/NYS-COVID19-Tracker/NYSDOHCOVID-19Tracker-Fatalities?%3Aembed=yes&%3Atoolbar=no&%3Atabs=n.
13. Bhatraju PK, Ghassemieh BJ, Nichols M, Kim R, Jerome KR, Nalla AK, et al. Covid-19 in Critically Ill Patients in the Seattle Region – Case Series. N Engl J Med [Internet]. 2020 Mar 30 [PMC7143164]. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32227758.
14. Prekker ME, Brunsvold ME, Bohman JK, Fischer G, Gram KL, Litell JM, et al. Regional Planning for Extracorporeal Membrane Oxygenation Allocation During COVID-19. Chest [Internet]. 2020 Apr 24 [PMC7182515]. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32339510.
15. Chen T, Wu D, Chen H, Yan W, Yang D, Chen G, et al. Clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study. BMJ. 2020;368:m1091.
16. Grasselli G, Zangrillo A, Zanella A, Antonelli M, Cabrini L, Castelli A, et al. Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region, Italy. JAMA. 2020;323(16):1574–81.
17. Chen N, Zhou M, Dong X, Qu J, Gong F, Han Y, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507–13.
18. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395(10223):497–506.
19. Zavascki AP, Falci DR. Clinical Characteristics of Covid-19 in China. N Engl J Med [Internet]. 2020 Mar 27; 382. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32220202.
20. Sanders JM, Monogue ML, Jodlowski TZ, Cutrell JB. Pharmacologic Treatments for Coronavirus Disease 2019 (COVID-19): A Review. JAMA [Internet]. 2020 Apr 13. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32282022.
21. Colson P, Rolain JM, Lagier JC, Brouqui P, Raoult D. Chloroquine and hydroxychloroquine as available weapons to fight COVID-19. Int J Antimicrob Agents. 2020;55(4):105932.
22. Ruan Q, Yang K, Wang W, Jiang L, Song J. Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China. Intensive Care Med [Internet]. 2020 Mar 3 [PMC7080116]. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32125452.
23. Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395(10229):1054–62.
24. Gao L, Jiang D, Wen XS, Cheng XC, Sun M, He B, et al. Prognostic value of NT-proBNP in patients with severe COVID-19. Respir Res. 2020;21(1):83.
25. Du RH, Liang LR, Yang CQ, Wang W, Cao TZ, Li M, et al. Predictors of Mortality for Patients with COVID-19 Pneumonia Caused by SARS-CoV-2: A Prospective Cohort Study. Eur Respir J [Internet]. 2020 Apr 8 [PMC7144257]. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32269088.
26. Gong J, Ou J, Qiu X, Jie Y, Chen Y, Yuan L, et al. A Tool to Early Predict Severe Corona Virus Disease 2019 (COVID-19): A Multicenter Study using the Risk Nomogram in Wuhan and Guangdong, China. Clin Infect Dis [Internet]. 2020 Apr 16. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32296824.
27. Wynants L, Van Calster B, Bonten MMJ, Collins GS, Debray TPA, De Vos M, et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. BMJ. 2020;369:m1328.
28. Cleophas TJ, Zwinderman AH. Machine Learning in Medicine-a Complete Overview. Springer; 2016.
29. Morde V, Setty VA. XGBoost Algorithm: Long May She Reign! 2019. Available from: https://towardsdatascience.com https-medium-com-vishalmorde-xgboostalgorithm-long-she-may-rein-edd9f99be63d.
30. Reinstein I. XGBoost, a Top Machine Learning Method on Kaggle, Explained. 2017. Available from: https://www.kdnuggets.com/2017/10/xgboost-topmachine-learning-method-kaggle-explained.html.