ABSTRACT
Introduction Risk factors of adverse outcomes in COVID-19 are defined but stratification of mortality using non-laboratory measured scores, particularly at the time of pre-hospital SARS-CoV-2 testing, is lacking.
Methods Multivariate regression with bootstrapping was used to identify independent mortality predictors in a derivation cohort of COVID-19 patients. Predictions were externally validated in a large random sample of the ISARIC cohort (N=14,231) and a smaller cohort from Aintree (N=290).
Results 983 patients (median age 70, IQR 53-83; in-hospital mortality 29.9%) were recruited over an 11-week study period. Through sequential modelling, a 5-predictor score termed SOARS (SpO2, Obesity, Age, Respiratory rate, Stroke history) was developed to correlate COVID-19 severity across low, moderate and high strata of mortality risk. The score discriminated well for in-hospital death, with area under the receiver operating characteristic values of 0.82, 0.80 and 0.74 in the derivation, Aintree and ISARIC validation cohorts respectively. Its predictive accuracy (calibration) in both external cohorts was consistently higher in patients with milder disease (SOARS 0-1), the same individuals who could be identified for safe outpatient monitoring. Prediction of a non-fatal outcome in this group was accompanied by high score sensitivity (99.2%) and negative predictive value (95.9%).
Conclusion The SOARS score uses constitutive and readily assessed individual characteristics to predict the risk of COVID-19 death. Deployment of the score could potentially inform clinical triage in pre-admission settings where expedient and reliable decision-making is key. The resurgence of SARS-CoV-2 transmission provides an opportunity to further validate and update its performance.
INTRODUCTION
Rapid and accurate prediction of the probability of adverse clinical outcomes is central to the management of global outbreaks of infection. [1-3] Stratification by predicted risk, most commonly for death, can support clinical judgement and potentially assist clinicians in community settings to decide how urgently to refer patients to hospital. Used appropriately, predictive scores can also help inform treatment-related decision making. The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) lends itself to predictive modelling by having a large at-risk population and a high adverse event rate including death.
Although the recent incidence of COVID-19 has decreased in some parts of the world, many countries are already experiencing a ‘second wave’ of new cases. It is widely anticipated that viral transmission will continue to surge in the months ahead, particularly with the onset of winter in the northern hemisphere. Not all patients infected with SARS-CoV-2 will require hospitalisation but even among those who initially experience mild symptoms, a sizeable proportion remain at risk of subsequent life-threatening clinical decline. The availability of a practical pre-hospital predictive tool to triage patients for safe discharge to an outpatient (virtual) monitoring system versus direct admission to hospital for observation or treatment would be highly advantageous.
Reliable prediction tools to differentiate between levels and sites of clinical care already exist and have been successfully implemented in pre-hospital practice. For example, both the CURB-65 and CRB-65 scoring systems for the assessment of community-acquired pneumonia include recommendations for out-of-hospital care. [4, 5] Recent research by the ISARIC-4C Consortium has provided an accurate tool to similarly prognosticate for COVID-19-attributed death in hospitalised patients but its reliance on laboratory-measured indices limits its applicability outside the institutional environment. [6] Prognostic evaluation of individuals with suspected SARS-CoV-2 infection at the time of diagnostic testing is potentially achievable but has yet to be examined.
Our objective was to develop and evaluate an easy-to-apply and accurate prediction score that could identify, early in the illness course, which patients infected with SARS-CoV-2 might benefit from an urgent hospital assessment or could be safely monitored in their own home. To develop the initial risk score, we used multivariate logistic regression to explore the relationships between a large panel of candidate predictors and COVID-19 death. Iterative modelling resulted in a pragmatic predictive score based on five widely available patient variables. The scoring of patients against these selected predictors permitted three distinct risk classes to be defined. The performance of the score was then assessed against two validation cohorts – a large and randomly selected subgroup of the ISARIC study patients and a smaller single-hospital cohort, the latter to better reflect local population characteristics and practice.
METHODS
Study design and characteristics of the derivation cohort
All individuals aged 18 or over who tested positive for SARS-CoV-2 nucleic acid by real-time reverse transcriptase polymerase chain reaction between 1st March and 16th May 2020 after presenting to the Emergency Department (ED) at Watford Hospital, West Hertfordshire NHS Hospitals Trust were prospectively recruited. Baseline clinical characteristics and investigation results were collected according to a pre-specified protocol. Patients were either referred to the virtual hospital (VH) for outpatient monitoring or admitted to a medical ward. The study was approved by the National Health Service Health Research Authority (20/HRA/2344; ethics reference: 283888).
Laboratory, physiologic and radiographic data
All laboratory tests were performed as part of routine clinical care. Nasopharyngeal mucosal swabs for rRT-PCR were couriered to the regional UK Public Health England laboratory. Baseline vital observations included all the parameters recommended by the National Early Warning Score. [7] Chest radiographs acquired in ED were collated and scored at the end of the recruitment period.
Location and level of care
Following presentation, patients who were clinically judged to have mild illness were referred to the virtual hospital for subsequent monitoring. To avoid missing early clinical deterioration in the post-assessment period, they were observed for up to 24 hours in hospital. Patients who remained admitted after the first 24 hours but who did not require additional respiratory support beyond wall-based oxygen were managed on designated medical wards. Where clinically indicated, continuous positive airway pressure (CPAP) was provided on such wards or on the intensive care unit (ICU); intubation and mechanical ventilation were undertaken on the ICU.
Identifying predictors of death in the derivation cohort
The primary outcome of the study was hospital discharge or in-hospital death. TRIPOD recommendations were followed for multivariate model evaluation and reporting. [8] 75 baseline clinical and non-clinical variables were initially collected based on their reported association with COVID-19 and analysed by uni- and multivariate logistic regression with bootstrap resampling. [3-5,9] Candidate predictors of death were assessed for potential clustering effects and missing at random values were addressed by multiple imputation with chained equations (MICE), [9] with 10 – 20 random draws to account for data variability.
Development and external validation of the clinical risk score
The large external cohort comprised a randomly selected sub-population of the ISARIC 4C derivation population (N=20,000 provided; 14,231 with complete data for scoring). The primary data of these individuals were submitted by 260 hospitals across England, Scotland and Wales to the prospective ISARIC World Health Organization Clinical Characterization Protocol UK (CCP-UK) study. [6] We also tested our score against a smaller population of SARS- CoV-2-positive cases from Aintree Hospital, Liverpool (N=303 provided; N=290 with complete data for scoring) as a single-setting validation control.
All performance metrics against external cohorts were analysed using the prediction score and not the regression model. In the initial stages, a preliminary score comprising 12 independent predictors of death, including Care Home residency, was developed; to enable external validation against the ISARIC cohort (which did not include residential data), Care Home status was excluded as a variable to yield an 11-predictor score. Its ability to discriminate for in-hospital mortality was assessed by the area under the receiver operating characteristic (AUROC). From this score, a condensed version comprising 5 clinical predictors was developed for pre-hospital application. Mortality cut-points at each risk level were assessed to define mild, moderate and high risk classes, followed by determination of positive and negative predictive values as well as sensitivity and specificity thresholds. Model performance was further assessed by calibration using a graphical representation of the Hosmer-Lemeshow ‘goodness-of-fit’ test to depict agreement between the expected (predicted) and observed (actual) outcome across the entire COVID-19 severity range in both validation cohorts. The summary relationship between the dependent variable (death) and different levels of disease severity in the external ISARIC population was expressed as McFadden’s R2.
Statistical analysis
Categorical variables were expressed as frequency (%), with significance determined by the Chi-squared test. Continuous variables were expressed as median (interquartile range) or mean (standard deviation) and analysed by the t-test, Kruskal-Wallis or Mann-Whitney U test, as appropriate. Odds ratios (ORs) were assessed as unadjusted and adjusted values with respect to in-hospital death, the latter determined by multivariate regression with bootstrapping of 1000 resamples (10 Bootstrap). MICE was used to generate valid estimates of randomly missing values in the derivation model. A P value of <0.05 was considered statistically significant. All statistical analyses including risk modelling calculations were performed using STATA, version 16 (Stata Corp., Texas, USA).
RESULTS
Baseline demographic and clinical characteristics of the derivation cohort
983 patients (52.5% male) confirmed as SARS-CoV-2 rRT-PCR-positive were recruited over the 11-week study period. Five patients remained in hospital at the time of data censoring on 31st May 2020. The median age of the cohort was 70 (IQR: 53 – 83; range 23 – 99); median age was lowest in the virtual hospital pathway (53; IQR 43 – 67) and highest amongst hospitalised patients who did not receive CPAP (77; IQR 61 – 86) (P<0.001) (appendix 1).
The commonest co-morbidities were hypertension (48.4%), pulmonary disease (30.0%), cardiac disease (26.6%), diabetes mellitus (23.6%), chronic kidney disease (CKD; 20.0%) and dementia (15.4%). Obesity, defined as BMI >30, was present in 24.7% (243) of the cohort, and associated with an unadjusted OR for death of 1.40 (95% CI: 1.18 – 2.68, P <0.05).
Overall, 294/983 (29.9%) patients died in hospital, the vast majority (97.3%) aged 50 or over. The mortality rates of different age brackets in the cohort (compared to the ISARIC and Aintree validation cohorts) are shown in appendix 2. The univariate OR for death increased with rising age, and was 14.86 (95% CI: 6.89 – 32.04) for those aged 70 – 79 and 20.87 (95% CI: 9.93 – 43.86) for those aged 80 or older (table 1). When stratified by maximal levels of care, mortality rate was lowest in the VH (1.8%) and highest in the ICU group (62.1%) (P<0.001) (appendix 3).
White Caucasian ethnicity constituted 77.3% (760/983) of the whole cohort while Asian, Black and other minor ethnicities (BAME) represented 16.5%, 4.5% and 1.7% respectively. Overall, White ethnicity was associated with the highest proportion of non-survivors (85.0%); in comparison, patients of Asian (OR 0.57, 95% CI: 0.38 – 0.85, P<0.01) or Black (OR 0.39, 95% CI: 0.17 – 0.88, P <0.05) background in this cohort were not at a higher risk of death from COVID-19. The proportion of non-survivors within each ethnic group was also highest in White (32.9%), followed by Asian (21.6%), Black (15.9%) and other minority groups (17.6%) (P<0.01). Of note, White patients were significantly older by median age (74, IQR 58-85) compared to Asian (57, IQR 46-71; P<0.0001) or Black (58, IQR 50-72; P<0.001).
Care home residency (204/983; 20.8%) was more common amongst non-survivors (P <0.001) and was associated with an unadjusted risk of death of 3.14 (95% CI: 2.28 – 4.32, P<0.001). Based on frailty data from 644 patients aged 65 or older, the univariate OR of this predictor for death was 2.52 (95% CI: 1.73 – 3.69; P<0.001) in those with a group 1 frailty score and 2.56 (95% CI: 1.62 – 4.06; P<0.001) in those with a group 2 frailty score (table 1).
The median time from symptom onset to presentation for the derivation cohort was 6 days (IQR 2.0 – 11.0), with no difference between survivors and non-survivors. The four commonest reported symptoms were fever (61%), breathlessness (57.9%), cough (52.8%) and myalgia (21.7%). Tachypnoea (respiratory rate >24/minute) and hypoxia (SpO2 ≤92% on ambient air) were evident in 35.9% and 31.4% of patients respectively, and associated with crude OR for death of 2.15 (95% CI: 1.63 – 2.84, P<0.001) and 3.74 (95% CI: 2.73 – 5.12, P<0.001) respectively. C-reactive protein >50 mg/L was more frequently documented in non-survivors (76.7% vs. 58.5%; P<0.001) and associated with an unadjusted OR for mortality of 2.40 (95% CI: 1.73 – 3.32, P<0.001). Lymphopenia was similarly more common in non-survivors (44.7% vs. 29.1%; P<0.001). with an unadjusted OR for death of 1.97 (95% CI: 1.47 – 2.64; P<0.001). A baseline chest radiograph (CXR) was available in 91% (895/983) patients; abnormalities in ≥4 radiographic zones were evident in 338 (37.8%) of cases and was associated with increased mortality on univariate analysis (OR 1.89, 95% CI 1.42 – 2.52, P<0.001).
Multivariate regression for independent risk factors of mortality
The bootstrapped multivariable regression analysis included the whole derivation cohort of 983 patients comprising 689 (70.1%) survivors and 294 (29.9%) non-survivors with complete or multiply imputed values for data. Older age, CKD stage 5, baseline hypoxia, elevated BMI, tachypnea, leucocytosis and a history of stroke were identified as the strongest independent predictors of mortality (table 1). In particular, age ≥70 had the highest OR for death over the other individual variables. Following adjustment for co-variate effects, Care Home residency was excluded as a predictor of mortality; cut-point analysis also showed that individual blood markers were not independently predictive.
Iterative modelling to construct an 11-predictor and 5-predictor risk prediction scores
We used values from 770 patients with complete data in the derivation cohort to develop an initial 11-predictor score that ranged from 1 – 18 points (appendix 4). The lowest score of 1 point reflected the KDIGO categorisation of CKD. [11] In this score, the correlation between increasing COVID-19 severity and in-hospital mortality followed a linear dose response relationship, particularly between a score of 3 (below which no deaths occurred) and 12 (above which no patient survived), and an accuracy (AUROC) of 0.84 (figure 1). A shorter 5- predictor score based solely on clinical parameters and scaled from 0 – 8 points was also developed (table 2; figure 1). This short score, abbreviated to SOARS (SpO2, Obesity, Age, Respiratory rate, Stroke history), demonstrated an AUROC of 0.82 and was retained for further evaluation as a practical pre-hospital risk stratification tool.
External validation and performance of the SOARS score
The performance metrics of the long 11-predictor and 5-predictor (SOARS) scores were assessed by their ability to discriminate for in-hospital mortality against both the ISARIC and Aintree validation cohorts (table 3). The longer score showed higher discrimination in the Aintree (AUROC 0.87) than the ISARIC cohort (0.77). The SOARS score had a slightly lower AUROC against both cohorts, namely 0.80 (Aintree) and 0.74 (ISARIC). In comparison, the performance of other scores based solely on different cut-offs of age were associated with inferior discriminatory ability. Comparison of some of the main population parameters between the derivation and both external cohorts are shown in appendix 5.
The mortality rate at each level of the SOARS score in the derivation and both validation cohorts are shown in table 4. For increased applicability, the SOARS score results were further categorised into three risk classes – low (SOARS 0 – 1), moderate (SOARS 2) and high (SOARS ≥3) (table 5). Between 2.3 – 3.2% of patients scoring 0 or 1 (low risk) in each of the three cohorts died due to COVID-19 whereas 2.3 – 6.3% of those scoring 2 (moderate risk) failed to survive to discharge. Overall, 9 out of every 10 deaths in each of the derivation and validation cohorts had a SOARS score of 3 or greater. Within this broad range of increasing scores, the highest proportion of deaths was encountered at SOARS score 4 in both the derivation cohort and ISARIC validation cohort (28.3% and 34.5% respectively) and at SOARS score 5 in the Aintree validation cohort (30.9%).
Sensitivity thresholds calculated for the larger validation (ISARIC) cohort showed that the low risk class (SOARS 0 – 1) comprised 16.6% of patients with an in-hospital mortality of 5.4%, sensitivity and negative predictive value (NPV) for ruling out death of 99.2% and 95.9% respectively (table 6). By comparison, 29.8% of patients were classified within the moderate risk class; their mortality rate was 14.5%, sensitivity 97.0% and NPV 94.6%. The high risk class comprised 70.2% of the validation cohort who scored across a wide range of SOARS scores, The NPV and specificity for a prediction of death were thus more variable, from 90.5% and 38.7% respectively for a SOARS score of 3, to 69.8% and 99.8% respectively at the other end of the scale when the SOARS score was 7. Only one patient scored 8; they survived to discharge.
Reliability of the risk estimates in the ISARIC cohort, modelled as calibration or goodness-of-fit between expected (predicted) and observed outcomes using SOARS, showed a calibration slope of 0.70, calibration-in-the-large (CITL) 0.02 and an expected-to-observed (E:O) ratio of 0.990 (figure 2). Calibration was slightly improved in the Aintree cohort with a slope of 0.80, CiTL −0.16 and E:O ratio of 1.06. A LOWESS (locally weighted scatterplot smoothing algorithm) curve was generated to show differences between these outcomes in both cohorts. The plot characteristics suggested that the model, whilst demonstrating good concordance, had greater predictive accuracy in low to moderate risk patients whose predicted probability of mortality was under 40%. Conversely, overestimation of mortality risk was evident in patients in the high risk group.
DISCUSSION
We show that prognostic evaluation of a small panel of baseline clinical and demographic characteristics of patients with COVID-19 enables their subsequent risk of in-hospital death to be quantified across three strata of risk. Findings were obtained by applying the 5-predictor SOARS (SpO2, Obesity, Age, Respiratory rate, Stroke history) score to a large random sample of the ISARIC cohort and a smaller single-hospital cohort from Aintree, both with individual-level data. Our objective was to enable risk stratification to be undertaken early, ideally pre-hospitalisation, during the encounter with COVID-19 suspected individuals. This role is not met by currently available prediction tools that rely on laboratory measurements.
The SOARS score discriminated well for COVID-19 mortality and its simplicity obviated the need for complex calculations. It also retained good predictive accuracy in two external validation cohorts, with performance metrics that were primarily reflected in its high negative predictive values for mortality amongst patients with the lowest risk scores (0 or 1). This characteristic is consistent with a high accuracy for predicting a non-fatal outcome in its key target group, namely individuals with milder COVID-19.
Patients who score SOARS 0 or 1 could be discharged home with advice to re-establish urgent contact if their symptoms worsened. Patients stratified as moderate risk (SOARS 2) could be virtually monitored with a pre-defined plan for care escalation if specific thresholds relating to deteriorating symptoms or self-recorded SpO2 were triggered. Patients in the high-risk class (score ≥3) are highly likely to be symptomatic and would, in all probability, be referred directly to the ED for hospital-based management. Thus, the target individuals for the SOARS score are those with a low or moderate risk of COVID-19 mortality.
Our data concur with other reports that advanced age is the strongest predictor of death from COVID-19. [12-15] The sharp increase in mortality in patients within the derivation cohort who were in or beyond their seventh decade of life was reflected in the magnitude of their respective adjusted odds ratios for in-hospital death, namely 7.4 (aged 70 – 79) and 10.7 (aged ≥80). Even so, predictive models generated with age as the lone variable showed poorer discriminatory ability than the SOARS score.
The early development of physiological abnormalities in COVID-19 does not always result in timely clinical presentation. In our study, two measures of acute physiological perturbation proved to be important predictors of COVID-19 mortality: hypoxia and tachypnoea. Although persistent hypoxia is more common in non-survivors of COVID-19, its relationship with tachypnoea remains incompletely understood. [16,17] Fewer than half of patients with COVID-19 who present to hospital with decreased oxygen saturation report experiencing subjective breathlessness. [18-21] One reason for this observation might be so-called ‘silent hypoxia’ where a blunted symptomatic perception of the effects of hypoxaemia is apparent even when low arterial oxygen tension is evident. [22] This phenomenon may be responsible for delays in seeking clinical attention. Such danger could be mitigated by accurate risk assessment including the measurement and tracking of SpO2 in patients who are deemed to not require immediate hospitalisation. The absence of oxygen determination in the CURB-65 score has been cited as limiting its utility in stratifying COVID-19 patients for outpatient management. [23]
The SOARS score was constructed with data from hospitalised patients as the very low adverse event rate amongst non-admitted cases (for example, in the VH pathway) curtailed the development of a prognostic tool. This issue has previously been highlighted in the context of CRB-65 where low event rates in community studies of pneumonia made predictive inferences difficult to conclude. [4] Other scores that have been used in COVID-19 studies have either not been designed for this disease or have relied heavily on laboratory-measured data. [24-27]
Our multivariate regression model was bootstrapped to reduce overfitting but not penalized prior to external validation. In common with other severity scores for COVID-19, we dichotomized several continuous data parameters which may have potentially obscured non-linear effects between predictors and outcome, contributing to the difference in AUROC values between our derivation and validation cohorts and between both validation cohorts. [6, 24, 26] We also used in-hospital mortality as an unambiguous disease-related primary outcome rather than 30 or 60-day mortality. The better performance of the SOARS score in the smaller Aintree validation cohort compared to the much larger ISARIC cohort may have been due to its more homogeneous case-mix. This comparison suggested that, on balance, the simplicity of a pre-hospital risk prediction tool, provided it retained acceptable accuracy, may outweigh any minor diminution of its performance arising from improved practicality.
Other limitations in the study include the occurrence of missing information despite prospective data collection. The use of multiple imputation to estimate missing values for multivariate regression and the availability of nearly 85% of observations for constructing the risk stratification rule helped to mitigate against underestimating their role. The modest sample size of our derivation cohort was dictated by the incident caseload during the pandemic. However, selective sampling of the pandemic timeline was avoided by including all COVID-19 cases from the initial rise to the subsequent decline in new case numbers over the 11-week study period. Finally, reduced score calibration at the high-risk end suggests that SOARS may overestimate the probability of death in the highest risk cases. However, the principal objective of this score was to enhance frontline decision-making in patients with a low predicted risk of mortality at a time when demand for in-patient resources are likely to be high.
In summary, prognostication using the SOARS score can be undertaken concomitantly with SARS-CoV-2 diagnostic testing to inform clinical triaging, including decisions about the placement of the patient for ongoing care. Analysis of the ISARIC validation cohort in this study showed that between 16.6% and 29.8% (those scoring up to SOARS 1 or 2 respectively) could potentially have avoided admission provided a safe alternative to hospitalization was in place. Prospective studies of SOARS implementation will enable the score to be calibrated against other independent cohorts of COVID-19 patients to examine its performance under conditions that may be unique to different localities. Such an opportunity may soon present itself if SARS-CoV-2 transmission continues to increase in the UK and beyond.
Data Availability
Data sharing De-identified data can be shared on request after publication of the manuscript. Requests to be forwarded to ORCID ID: 0000-0001-7845-0173 (F.chua@rbht.nhs.uk)
Contributors
RV, AB, MK and RM developed the clinical algorithm and supervised patient management. FC, RV and AD conceived and designed the investigational plan. FC drafted the manuscript with contributions of intellectual content from RV, AD, LGS, PLM and AB. RV, TV, MK, RM, JS, LGS, ET, HM, SM, NM, SA, ML, AO, CP, RK, TJH, RT, SR, MS and JL collected the data at respective sites. AD, FC and RV examined the data and undertook statistical analyses. All authors have seen and approved the final version of the manuscript. RV is guarantor and attests that all named authors and contributors meet authorship criteria and that no others meeting such criteria have been omitted.
Funding
No funding sources, commercial or non-commercial, were involved in this study.
Competing interests
None declared.
Patient consent for publication
Not required.
Ethics approval
Ethical approval was provided by Stanmore Research Ethics Committee, London, England (IRAS ID: 283888). The study is registered with the National Health Service Health Research Authority under the reference 20/HRA/2344.
Pre-specified study protocol
Included in Supplementary Materials.
Checklist for the appropriate study type
STROBE and TRIPOD checklists are included as supplementary appendices.
Provenance and peer review statement
Not commissioned, externally peer reviewed. No part of this work has been written by a medical writer or published in printed or electronic form. Some of the findings of this study have been accepted for presentation at the British Thoracic Society Winter Meeting 2020 to be held in February 2021, at which point the abstract will be published in printed format in a supplement of Thorax. None of the results of this study have been posted to a preprint server.
SUPPLEMENTARY MATERIAL
Appendix. Pre-specified study protocol for clinical data collection
PREDICT COVID Clinical Data Collection protocol
Date: 1st March 2020
Aims
To prospectively collect data on all adults (> 18 years) with laboratory-confirmed SARS-CoV-2 infection (COVID-19) presenting to Watford Hospital, West Hertfordshire NHS Trust, during the first wave of the SARS-CoV-2 pandemic, including an outcome of in-hospital death or hospital discharge,
To develop a prognostic (risk prediction) score using the above derivation data,
To construct a practical clinical scoring system for predicting mortality (and risk of morbidity) after validation of score against external, i.e. independent cohorts of COVID-19 patients from other UK sites,
To include and assess outcomes of patients referred to the COVID-19 Virtual Hospital (out-of-hospital monitoring) in the same NHS Trust,
To monitor and characterise surviving patients for up to 12 months from the time of confirmation of SARS- CoV-2 infection, including the domains of psychological, physiological and radiological impairment and recovery.
Primary outcome
In-hospital mortality with minimum 30-day follow-up data.
Secondary outcomes
Longer term mortality and morbidity: radiological, psychological and cardiorespiratory. This will also be assessed against ongoing health care needs.
Patient inclusion
All adult patients (aged >18 years) with SARS-CoV-2 real-time reverse transcriptase polymerase chain reaction (rRT-PCR) confirmation.
Completed admission and outcomes at 3 months and 12 months.
Inclusion criteria
Readily available patient or clinical characteristic to attending clinicians upon presentation to hospital (Accident & Emergency department, Acute Medical Receiving Unit)
Blood markers should be commonly measured and results available for review within the first 24 hours of admission
All parameters relating to oxygen supplementation and advanced respiratory support including one or more of: continuous positive airway pressure (CPAP), bilevel non-invasive ventilation (NIV), high-flow nasal cannula (HFNC) oxygenation and invasive mechanical ventilation (IMV) needs
Exclusion criteria
All SARS-CoV-2 rRT-PCR negative patients irrespective of clinical suspicion of COVID-19. All individuals aged <18y.
Selection of candidate variables for initial data collection
Candidate variables were chosen based on knowledge of their potential association/s with SARS-CoV-2 infection and clinical disease (COVID-19). A systematic literature search was undertaken to identify these variables with respect to their predictive association for mortality and other adverse outcomes including COVID-19 severity and disease-related complications such as requirement for critical care and the development of COVID-19-associated acute respiratory distress syndrome (ARDS).
Systematic literature for English language articles in the following search databases: PubMed, EMBASE, WHO Medicus, Web of Science and Google Scholar (particularly for pre-print publications on medRxiv). Search terms included SARS-CoV-2; COVID-19; coronavirus; ARDS; pneumonia; sepsis; influenza; risk prediction; risk score; prognosis; validation. No date restrictions were imposed.
Statistical analysis for derivation and validation modelling
In analysing the data collected from the derivation (West Herts) cohort, categorical variables will expressed as frequency (%), with significance determined by the Chi-squared test. Continuous variables will be analysed for median (interquartile range) or mean (standard deviation) outcomes and analysed by the t-test, Kruskal-Wallis or Mann-Whitney U test, as appropriate. Missing data in the derivation cohort will be expected even with prospective data collection; missingness of data will be assumed to be at random and handled by multiple imputation by chained equations (MICE) with at least ten imputations, provided the proportion of missingness for the defined parameter constitutes no more than 20% of the cohort. Collated data will be subjected to univariate and multivariate logistic regression in order to determine odds ratios (OR) for in-hospital mortality. The latter will also be internally validated by bootstrapping using a minimum of 1000 re-samples. Predictor interactions will be analysed by appropriate methodology such as the likelihood ratio (LR) test comparing broad and narrow (constrained) models.
All performance metrics against external validation cohorts will be analysed using the prediction score and not with the multivariate regression model. The performance of the derivation model will be assessed for discriminatory ability (area under the receiver operating characteristic, AUROC) and calibration (graphical representation of Hosmer-Lemeshow analysis).
All statistical analyses including risk modelling calculations will be performed using STATA, version 16 (Stata Corp., Texas, USA).
R Vancheeswaran, F Chua, A Draper, T Vaghela and A Barlow (study design and responsible persons for data analysis and interpretation)
MARCH 2020
STROBE Statement—Checklist of items that should be included in reports of cohort studies
TRIPOD Checklist: Prediction Model Development and Validation
END
Acknowledgements
We are grateful to the healthcare teams whose efforts in the clinical field were fundamental to this work. We would like to thank all the patients involved for their vital their contributions. We are very grateful to the ISARIC Coronavirus Clinical Characterisation Consortium (ISARIC- 4C) Investigators in particular, J Kenneth Baillie (lead investigator) and Malcolm G Semple (chief investigator) for providing access to data for the external validation of our model.
Appendix 1. Baseline clinical characteristics of the derivation cohort
Appendix 2. In-patient mortality rate by age bracket in the derivation and validation cohorts
Appendix 3. In-patient mortality rate by level of care in the derivation cohort (N=983)
Appendix 4. 11-predictor score (1 – 18 points) for predicting in-hospital COVID-19 death
Appendix 5. Comparison of key parameters between the derivation and validation cohorts
Footnotes
Author affiliations and emails Chua F Consultant Respiratory Physician (F.chua{at}rbht.nhs.uk)
Vancheeswaran R Consultant Respiratory Physician (Rama.vancheeswaran{at}nhs.net)
Draper A Consultant Respiratory Physician (Adrian.draper{at}stgeorges.nhs.uk)
Vaghela T Information Governance team (T.vaghela{at}nhs.net)
Knight M Consultant Respiratory Physician (Matthew.knight{at}nhs.net)
Mogal R Consultant Respiratory Physician (Rahul.mogal{at}nhs.net)
Singh J Consultant Radiologist (Jaswinder.singh5{at}nhs.net)
Spencer LG Consultant Respiratory Physician (Lisa.spencer{at}liverpoolft.nhs.uk)
Erica Thwaite Consultant Radiologist (Erica.thwaite{at}liverpoolft.nhs.uk)
Mitchell H Core medical trainee (Harry.mitchell{at}nhs.net)
Calmonson S Core medical trainee (Sam.calmonson1{at}nhs.net)
Mahdi N Specialist Respiratory trainee (Noor.mahdi{at}nhs.net)
Assadullah S Core medical trainee (Shershah.assadullah{at}nhs.net)
Leung M Core medical trainee (Matthew.leung{at}nhs.net)
O’Neill A Medical student (Mzyao6{at}nottingham.ac.uk)
Popat C Core medical trainee (Chhaya.popat{at}nhs.net)
Kumar R Core medical trainee (Radhika.kumar3{at}nhs.net)
Humphries TJ Specialist Respiratory trainee (Thomas.j.humphries{at}doctors.org.uk)
Talbutt R Specialist Respiratory trainee (Rtalbutt{at}doctors.org.uk)
Raghunath S Specialist Respiratory trainee (Sarika.raghunath{at}nhs.net)
Molyneaux PM Consultant Respiratory Physician (P.molyneaux{at}imperial.ac.uk)
Schechter M Information Governance team (Miriam.schechter{at}nhs.net)
Lowe J Information Governance team (Jeremylowe{at}nhs.net)
Barlow A Consultant Respiratory Physician (A.barlow1{at}nhs.net)