Abstract
Global healthcare systems are challenged by the COVID-19 pandemic. There is a need to optimize allocation of treatment and resources in intensive care, as clinically established risk assessments such as SOFA and APACHE II scores show only limited performance for predicting the survival of severely ill COVID-19 patients. We speculated that proteomics, which comprehensively captures host physiology, could in combination with new data-driven analysis strategies produce a new generation of prognostic discriminators. We studied two independent cohorts of patients with severe COVID-19 who required intensive care and invasive mechanical ventilation. SOFA score, Charlson comorbidity index and APACHE II score were poor predictors of survival. Plasma proteomics instead identified 14 proteins whose concentration trajectories differed between survivors and non-survivors. A proteomic predictor trained on single samples obtained at the first time point at maximum treatment level (i.e. WHO grade 7), weeks before the outcome, achieved accurate classification of survivors in an exploratory cohort (AUROC 0.81) as well as in the independent validation cohort (AUROC 1.0). The majority of proteins with high relevance in the prediction model belong to the coagulation system and complement cascade. Our study demonstrates that predictors derived from plasma protein levels have the potential to substantially outperform current prognostic markers in intensive care.
Trial registration: German Clinical Trials Register DRKS00021688
Introduction
The COVID-19 pandemic has brought health systems around the globe to the brink of collapse. Capacities for intensive care treatment of patients with organ failure have reached their limits in many regions with intense SARS-CoV-2 transmission and were often central to political decisions regarding restrictions on public life, e.g. through contact restrictions or lock-downs. Various models for classification of disease severity and for prediction of clinical trajectories and outcome have been developed for COVID-19, based on laboratory measurements, clinical scores, imaging, and omics technologies [1–4]. These pointed to the importance of specific immune cells, inflammatory and antiviral cytokines and chemokines, as well as the coagulation cascade in COVID-19 disease progression [4–12]. They predict the risk of the future need for mechanical ventilation in the heterogeneous group of patients at early time points, e.g. at admission to the hospital, when clinical parameters and biomarkers differ substantially between mildly affected and severely ill patients [1–4,13].
Treatment decisions within the most severely ill patients, for instance whether a patient should be treated with extracorporeal membrane oxygenation (ECMO), have a major impact on resources. Currently such decisions are often based primarily on patient’s age, comorbidities, and established intensive care prognosis models, such as the Sequential Organ Failure Assessment (SOFA) or Acute Physiology and Chronic Health Evaluation (APACHE II), which assess the patient on the basis of a combination of established clinical and laboratory risk parameters [14,15]. The predictive values of both SOFA and APACHE II for the most critical forms of COVID-19 are limited [16–18], creating a diagnostic gap and imminent need for reliable predictors, specifically validated in severely ill COVID-19 patients, to guide and tailor efforts in treating these critically ill patients. Indeed, classifying clinical trajectories within more homogeneous groups such as WHO grade 7 patients is considerably more difficult to achieve than a molecular severity classification that distinguishes mild from severe patients; physiological and molecular differences are less pronounced within the same than between different severity groups. As a consequence, the relative impact of confounders and random environmental factors on molecular and physiological parameters for clinical decision making is stronger.
Plasma proteomics holds the promise of integrating the genetic background of an individual with their life history, physiological, nutritional, and demographic parameters, and hence has the potential to form the foundation of a new generation of predictors [19–23]. Among the spectrum of proteomic technologies available, mass spectrometry has the appeal that once markers are identified, they allow for the direct generation of targeted panel assays measurable by selected reaction monitoring (SRM), simplifying their implementation into clinical routine. Recently, new mass spectrometry based proteomic technologies have been developed to increase throughput and measurement precision, so that the path from discovery to application is simplified [5,24–27].
We studied proteomes of two well characterized cohorts of the most severely ill patients with COVID-19 in two independent health care centers (Charité-Universitätsmedizin, Berlin, Germany, and Medical University of Innsbruck, Austria) who gave informed consent to deep clinical and molecular phenotyping [13,18,28]. We found 14 protein concentration trajectories that distinguish survivors from non-survivors. Moreover, a machine learning (ML) model based on parenclitic networks generated accurate prognoses from single time point samples collected once the patient reached the maximum treatment level, a median of 39 days before outcome. The ML predictor substantially outperformed established clinical risk scores.
Results
The exploratory cohort used for marker identification and model generation consisted of the 50 most severely ill COVID-19 patients out of a cohort of 168 patients with varying disease severity, treated between 15 March and 16 September 2020 at Charité University Hospital, Berlin, Germany (Fig. 1A) [13,18,28]. All 50 patients required intensive care with invasive mechanical ventilation plus additional organ support such as renal replacement therapy (RRT), ECMO, or vasopressors, corresponding to grade 7 on the WHO Ordinal Scale for Clinical Improvement. Patients with limitations of therapy according to their wish were excluded. There were no treatment restrictions due to shortages of intensive care capacity at the time of this patient cohort. Of the 50 patients, 36 (72%) required RRT, 19 (38%) patients were treated with ECMO, and 16 (32%) patients were treated with both RRT and ECMO. Fifteen (30%) patients died. Median time of hospitalization in survivors was 63 days (n=35, IQR 44-89). Median time from admission to death was 28 days (n=15, IQR 16-43). Patient characteristics are shown in Supplementary table 1.
The Charlson Comorbidity Index [29,30] performed poorly in distinguishing survivors from non-survivors, with an AUROC of 0.63 (P = 0.16, Fig. 2A). From a time-resolved data resource for the PA-COVID-19 study, spanning a compendium of clinical parameters, plasma proteomes, cell counts, enzyme activities, and outcomes [13], we further determined the SOFA and APACHE II scores. These scores, too, could not confidently distinguish survivors from non-survivors (Fig. 2A, AUROC = 0.68, P = 0.05 for APACHE II score at ICU admission and AUROC = 0.65, P = 0.11 for SOFA score at the time of first sampling at WHO grade 7).
Studying the plasma proteomes [13] we found 78 proteins the concentration of which significantly changed during the patients’ disease course. Out of these proteins, 14 were found to change differently over time for survivors and non-survivors (Fig. 1B, C). This included a significant increase in inflammatory proteins over time (SAA1, SAA2, CRP, ITIH3, LRG1, SERPINA1, SERPINA10 and LBP) in patients with fatal outcomes, and a corresponding decrease in survivors. Likewise, a decrease in anti-inflammatory proteins (SERPINA4, A2M) was noted in non-survivors but not in survivors, indicating a persistent pro-inflammatory signature in the former. Similarly, two key molecules in the coagulation system, thrombin (F2) and plasma kallikrein (KLKB1), known to be decreased in severe COVID-19 [11,13], further decreased over time in non-survivors while increasing in survivors.
For diagnostic purposes and treatment decisions, however, time series data are difficult to implement. We therefore explored the potential of using the earliest sample obtained at the maximum treatment level (WHO grade 7), i.e. at a time point critical for decisions about escalation of treatment, to predict the clinical outcome (survival). The median time until the outcome was 39 (IQR 16 - 64) days. We established a machine learning model based on parenclitic networks, a graph-based approach in which networks representing the deviation of an individual from the population are derived [31,32]. The networks are generated (Methods) by considering every pair of analytes (proteins) individually and calculating the respective edge weight as the estimated probability of fatal outcome based on this pair of proteins. Predictive models are then generated by considering the topological differences between networks from individual cases (non-survivors vs. survivors). We achieved high prediction accuracy on the test subjects, who were excluded when training the machine learning model (in a cross-validation fashion, see Methods), with AUC = 0.81 for the receiver-operating characteristic (ROC) curve (Fig. 2B). Out of the 25 proteins with the highest relevance in the parenclitic model, 15 are components of the coagulation system and 8 proteins belong to the complement cascade (Supp. Table 2).
To independently validate the proteomic predictor, we examined its performance on an independent cohort of 24 patients with critical COVID-19 from Austria (survival n=19, death n=5, median time between sampling and outcome 22 days, interquartile range 15 - 42 days) (‘Innsbruck’ cohort, Methods). Despite the validation cohort originating from a different hospital and health care system, the machine learning model demonstrated high predictive power on this independent cohort (AUROC = 1.0, P = 0.00038, Fig. 2C). Using the cutoff value for survival prediction derived from the Charité cohort, the model correctly predicted the outcome for 18 out of 19 patients who survived and for 5 out of 5 patients who died in this independent ‘Innsbruck’ cohort.
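As a minimal sketch of the arithmetic behind these figures, the snippet below derives sensitivity, specificity and accuracy from the reported counts (treating death as the positive class, which is an assumption of this sketch), and uses purely hypothetical scores to illustrate how a perfect ranking (AUROC = 1.0) can coexist with one misclassification when a cutoff is transferred from another cohort. The function name `auroc` and all `toy_*` values are illustrative and not taken from the study.

```python
def auroc(fatal_scores, survivor_scores):
    """Fraction of (fatal, survivor) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for f in fatal_scores:
        for s in survivor_scores:
            if f > s:
                wins += 1.0
            elif f == s:
                wins += 0.5
    return wins / (len(fatal_scores) * len(survivor_scores))


# Counts reported in the text for the Innsbruck cohort at the Charite-derived cutoff.
sensitivity = 5 / 5            # deaths correctly predicted
specificity = 18 / 19          # survivors correctly predicted
accuracy = (5 + 18) / (5 + 19)

# Hypothetical scores (not study data): every fatal case outranks every survivor,
# yet one survivor lies above the transferred cutoff.
toy_fatal = [0.9, 0.8]
toy_survivors = [0.6, 0.3, 0.2]
toy_cutoff = 0.5
toy_auroc = auroc(toy_fatal, toy_survivors)
toy_false_alarms = sum(s > toy_cutoff for s in toy_survivors)
```

In this toy setting the ranking is perfect while one survivor is flagged as high risk, which mirrors the pattern of 18 out of 19 survivors being classified correctly despite an AUROC of 1.0.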
Discussion
The prognostic value of several biomarkers (e.g. CRP, IL-6, ferritin) and clinical scores for predicting disease progression in COVID-19 at early disease stages, e.g. at hospital admission, is now well established [33,34]. For the comparatively homogeneous subgroup of severely ill patients already requiring mechanical ventilation and additional organ support, prediction of future disease trajectories and outcome (survival or death) is by far more challenging and only limited data exist [16,35,36]. Moreover, clinical severity scores are often not validated for unconscious patients and laboratory measurements are frequently confounded by intensive care treatment. Outcome of ICU patients may further be critically determined by resource constraints, the varying level of experience with organ replacement therapies or the rates of superinfection, rendering prediction complex [36]. On the other hand, patients in intensive care units, and particularly those in need of special organ replacement therapies such as ECMO, require a disproportionately large share of resources compared to other patients, so decisions to initiate such therapies should be based on the best information and assessment possible. Prognostic tools in critically ill patients are hence of crucial importance to guide and tailor the treatment efforts. This is particularly true in a situation when health care systems are overstrained.
Previously, we and others investigated plasma proteome alterations in COVID-19 [5–7,9,11,13], which show a remarkable ability to classify the severity of disease. New proteomic platform technologies have significantly gained precision and throughput compared to their predecessors, rendering the application of multivariate regression models more effective and bringing them increasingly close to routine clinical use [5]. Importantly, even without platform technologies, biomarkers identified in proteomic profiles can be translated into clinical use, e.g. by using standard techniques such as selected reaction monitoring (SRM) or ELISA.
Here, we show that an increase in specific inflammatory and acute phase proteins over time (e.g., SAA1;SAA2, CRP, ITIH3, LRG1, SERPINA1, and LBP) is associated with the risk of death from COVID-19, while an increase of kallikrein (KLKB1), kallistatin (SERPINA4), thrombin (F2), apolipoprotein C3 (APOC3), GPLD1, and the protease inhibitor A2M, is associated with survival. Kallikrein is involved in the blood coagulation system, fibrinolysis, and the complement cascade, three systems known to be dysregulated in COVID-19 [37–39]. It mediates the cleavage of kininogen to bradykinin and des-Arg9-bradykinin, a potent vasoactive peptide which is counter-regulated by ACE2, the cell entry receptor for SARS-CoV-2. Since the loss of ACE2 in COVID-19 supposedly leads to an imbalance of bradykinins, inhibition of the kallikrein-kinin system has been discussed as treatment strategy in COVID-19 [40–42]. This hypothesis is not supported by our data, which indicate improved prognosis with increasing kallikrein levels. Kallikrein is counterbalanced by kallistatin, which equally increased over time in survivors in our study population, thereby potentially equilibrating the increase in the kinin-kallikrein system. Kallistatin is known for pleiotropic effects in vascular repair, endothelial function, and inflammation [43] and possesses protective properties in acute lung injury. According to our data kallistatin should be considered as a potential candidate for clinical testing in critical COVID-19 [44].
While prognostic assessments based on repeated measurements over time allow for treatment monitoring, including evaluation of experimental therapies in clinical trials, prognostic measurements from single time points are particularly valuable for timely patient management and resource allocation. We therefore employed a machine learning model to integrate proteomic measurements from the first time point at WHO grade 7, i.e. invasive mechanical ventilation and additional organ support therapy, in order to derive a prognosis of outcome. We achieved high prognostic values both in the exploratory cohort and in a fully independent cohort.
The majority of proteins with the highest relevance for the machine learning predictor were components of the coagulation system and the complement cascade (Supp. Table 2). Both systems are known to be crucial for treatment and disease courses for severely ill COVID-19 patients [8,9]. This is particularly well illustrated by recent data from a multi-platform clinical trial indicating that a substantial proportion of patients with severe COVID-19 develop thromboembolic events despite therapeutic anticoagulation [45,46]. The protein with the highest relevance in our model is Fetuin-A (AHSG), which is known to be strongly downregulated in severe COVID-19 [9,13]. Of note, genetic polymorphisms associated with higher AHSG plasma concentrations were found to be protective in SARS-CoV-1 infection [47]. One important function of AHSG is regulation of inflammation through deactivation of macrophages [48], and there is emerging evidence that macrophages play a key role in pulmonary inflammation and dysfunction in COVID-19 [10,49–51].
In summary, we have leveraged the power of the proteome to address a problematic diagnostic gap in the prognosis of the most critical form of COVID-19, which is not covered by established clinical assessments such as the SOFA or APACHE II score. We show that the proteome accurately predicts survival in critically ill patients with COVID-19 from samples that were collected a median of 39 days before the outcome. The majority of proteins with high relevance in the model are components of the coagulation system and complement cascade, highlighting their critical role in progression and outcome of the most severe COVID-19.
Methods
Charité patient cohort and clinical data
Patients included in this analysis are a sub-cohort of the Pa-COVID-19 study conducted at Charité - Universitätsmedizin Berlin, a prospective observational cohort study on the pathophysiology of COVID-19 as described previously [18,28]. All patients with PCR-confirmed SARS-CoV-2 infection that progressed to critical disease (WHO grade 7, i.e. invasive mechanical ventilation and additional organ support), were eligible for inclusion. Exclusion criteria included refusal to provide informed consent by the patient or a legal representative, and any condition prohibiting serial biosampling. Patients were treated according to current clinical guidelines. Patients for whom limitation of therapy was decided according to the patient’s wish were excluded from analysis. In three further cases, limitation of therapy was decided at a later time point according to the patient’s presumed wish and predictably unfavorable outcome. All other patients received maximum intensive care treatment including organ replacement therapies at the discretion of the responsible physicians. One patient (ID 135), who was still hospitalized and clinically improving 5 months after admission, was classified as a survivor. Patients still in critical condition 5 months after admission were excluded due to uncertain outcome.
Biosampling of EDTA plasma for proteome measurement was performed up to 3 times per week after inclusion. Disease severity was assessed according to the WHO ordinal scale for clinical improvement (World Health Organisation 2020). Clinical data were recorded in SecuTrial®. Pseudonymized data exported from SecuTrial® were processed using JMP Pro 15 (SAS Institute Inc., Cary, NC, USA).
Innsbruck Patient cohort and clinical data
Serum samples from patients admitted to the intensive care unit at the Department of Medicine, University Hospital of Innsbruck with PCR-confirmed severe COVID-19 were collected within the first days (median 7.5, IQR 5-12) after admission, and written informed consent was obtained. Patients were treated according to national guidelines. The study was approved by the local ethics research committee EK-Nr. 1107/2020, and EK-Nr. 1103/2020 for follow-up.
Statistical analysis and multiple-testing correction
Statistical testing on proteomic and diagnostic data [13] was performed in the R environment for statistical computing, version 3.6.0 [52]. All protein measurements were first log2-transformed and only protein groups matched to at least three different peptides were considered. Quantities of gene products corresponding to open reading frames IGxx (i.e. different types of immunoglobulin chains) were summed together to generate quantities representative of the overall levels of immunoglobulin classes (IGHVs, IGLVs, etc). Significance testing for equal medians was performed using the Mann-Whitney U test, as implemented in the “wilcox.test” function of the “stats” R package. Multiple-testing correction was performed using the Benjamini-Hochberg false discovery rate controlling procedure [53], implemented in the “p.adjust” function of the “stats” R package. Adjusted p-values below 0.05 were considered significant.
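The multiple-testing step can be illustrated with a short sketch. The study used the “p.adjust” function in R; the pure-Python function below is only a stand-in that follows the standard Benjamini-Hochberg step-up rule, and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five proteins (not study data).
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

Here the third and fourth proteins have raw p-values below 0.05 but adjusted values just above it, so only the first two would be reported as significant under the 0.05 threshold.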
Identifying omics trajectories that are predictive of survival at the peak period of the disease
For each omics feature, the difference between its log2-levels at the last and the first sampling timepoints during the peak period of the disease was considered. This period was defined as the time when the patient was receiving the most intensive treatment during their stay in hospital, that is the time when the patient was at WHO grade 6 or 7. The distribution of this difference between survivors and non-survivors was compared using the Mann-Whitney U test. Only non-DNI patients with known outcome were included.
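A minimal sketch of this comparison for a single protein follows. It assumes each patient's series has already been restricted to the peak period, uses hypothetical values, and computes only the Mann-Whitney U statistic; the p-value would come from a statistics library such as the “wilcox.test” function used in the study.

```python
def log2_change(series):
    """Last minus first log2 level; `series` is a list of (day, log2_level)."""
    ordered = sorted(series)
    return ordered[-1][1] - ordered[0][1]


def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs with x > y, ties counted as one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical peak-period measurements of one protein (not study data).
survivor_series = {
    "s1": [(0, 10.0), (7, 9.0)],
    "s2": [(2, 11.0), (9, 10.5)],
    "s3": [(1, 9.5), (5, 9.0)],
}
non_survivor_series = {
    "n1": [(0, 10.0), (6, 11.5)],
    "n2": [(3, 9.0), (10, 9.75)],
}

survivor_deltas = [log2_change(s) for s in survivor_series.values()]
non_survivor_deltas = [log2_change(s) for s in non_survivor_series.values()]
u_stat = mann_whitney_u(non_survivor_deltas, survivor_deltas)
```

In this toy example every non-survivor shows a larger change than every survivor, so U equals the number of pairs (2 × 3 = 6), the most extreme value possible.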
Prediction of survival
The first time point measured at the WHO grade 7 was selected per patient. To reduce the feature space used as input for the machine learning model, we limited it to the quantities of 57 proteins which are FDA-approved biomarkers with MRM assays available [54] and which were quantified with at least three different peptides in this study. Missing values were imputed using minimal value imputation, and the data were standardized.
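The preprocessing can be sketched as follows. The text does not state whether the minimum is taken per protein or globally, nor which standard deviation estimator was used; this sketch assumes a per-protein minimum and the sample standard deviation, and the protein names and values are hypothetical.

```python
from statistics import mean, stdev


def impute_min(column):
    """Replace missing values (None) with the smallest observed value."""
    observed = [v for v in column if v is not None]
    floor = min(observed)
    return [floor if v is None else v for v in column]


def standardize(column):
    """Centre to zero mean and scale to unit sample standard deviation."""
    mu = mean(column)
    sd = stdev(column)
    return [(v - mu) / sd for v in column]


# Hypothetical log2 quantities for two proteins across four patients.
raw = {
    "PROT_A": [12.0, None, 14.0, 16.0],
    "PROT_B": [8.0, 9.0, None, 10.0],
}
imputed = {name: impute_min(col) for name, col in raw.items()}
processed = {name: standardize(col) for name, col in imputed.items()}
```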
Machine learning was carried out using the parenclitic networks approach [31,55]. Briefly, during training, for each pair of features, a radial SVM classifier is trained (using the svm() function from the “e1071” R package with default settings). For each sample, a network is then built, wherein vertices correspond to features and the edge weight is the death probability as predicted by the SVM classifier. Maximum, mean and standard deviation of the edge weights, as well as the numbers of edges with weights greater than 0.5 (i.e. fatal outcome is predicted) and nodes with at least one such edge are calculated. A LASSO classification model (alpha = 0.01) is then constructed on these 5 features using the glmnet() function of the “glmnet” [56] R package with default settings.
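To make the five network summary features concrete, the sketch below computes them for one patient's network. The edge weights stand in for the pairwise SVM death probabilities and are hypothetical, as are the function and key names; the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def network_features(edge_weights, threshold=0.5):
    """Summarise one patient's network given {(protein_a, protein_b): weight}."""
    weights = list(edge_weights.values())
    # Edges on which the pairwise classifier predicts a fatal outcome.
    strong = [pair for pair, w in edge_weights.items() if w > threshold]
    strong_nodes = {node for pair in strong for node in pair}
    return {
        "max": max(weights),
        "mean": mean(weights),
        "sd": stdev(weights),
        "n_strong_edges": len(strong),
        "n_strong_nodes": len(strong_nodes),
    }


# Hypothetical edge weights for a four-protein network (not study data).
edges = {
    ("A", "B"): 0.75, ("A", "C"): 0.25, ("A", "D"): 0.5,
    ("B", "C"): 1.0, ("B", "D"): 0.25, ("C", "D"): 0.25,
}
features = network_features(edges)
```

These five numbers per patient would then form the input row for the final classification model.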
For the assessment of the classifier performance (Charité cohort), a cross-validation method was applied in the following way: the prediction was made for each sample by excluding (withholding) it from the dataset along with two other samples (chosen randomly with the constraint that out of 3 samples one corresponds to a non-survivor and two to survivors), training the classifier on the remaining (independent) samples and then generating predictions for the withheld samples using the trained model. Such a leave-3-out partition was generated randomly 50 times and the predictions for each sample were averaged. For the assessment of the performance on an independent dataset (Innsbruck cohort), the classifier was trained on all the Charité samples and used to estimate the probabilities of fatal outcome on the Innsbruck cohort. The source code is provided in supplementary materials.
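One possible reading of this leave-3-out scheme is sketched below. The `fit_predict` callable is a hypothetical placeholder for training the classifier and scoring the withheld sample; the stand-in `base_rate` simply returns the fraction of non-survivors in the training set so that the example runs without any model. Patient identifiers are invented.

```python
import random


def draw_triplet(target, survivors, non_survivors, rng):
    """Withhold `target` plus two others: one non-survivor and two survivors in total."""
    if target in non_survivors:
        others = rng.sample(survivors, 2)
    else:
        other_survivor = rng.choice([s for s in survivors if s != target])
        others = [other_survivor, rng.choice(non_survivors)]
    return [target] + others


def cross_validate(survivors, non_survivors, fit_predict, repeats=50, seed=0):
    """Average each sample's prediction over `repeats` random leave-3-out partitions."""
    rng = random.Random(seed)
    all_ids = survivors + non_survivors
    averaged = {}
    for target in all_ids:
        scores = []
        for _ in range(repeats):
            held_out = draw_triplet(target, survivors, non_survivors, rng)
            training = [s for s in all_ids if s not in held_out]
            scores.append(fit_predict(training, target))
        averaged[target] = sum(scores) / repeats
    return averaged


survivors = ["S1", "S2", "S3", "S4", "S5", "S6"]
non_survivors = ["N1", "N2", "N3"]


def base_rate(training, target):
    # Placeholder model: ignores the target and returns the training death rate.
    return sum(1 for s in training if s in non_survivors) / len(training)


averaged = cross_validate(survivors, non_survivors, base_rate)
```

Because each withheld triplet has a fixed class composition, the class balance of every training set is identical across repeats, which is presumably the purpose of the constraint.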
The ‘relevance’ scores for proteins in the parenclitic model were calculated as Kleinberg’s authority centrality scores for the respective vertices in the “generalizing network”. This network was generated by (i) replacing edge weights greater than 0.5 with 1.0 and weights less than 0.5 with 0.0 in the networks corresponding to non-survivors and (ii) averaging the resulting networks.
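The construction of the generalizing network and the authority scores can be sketched with a plain power iteration. This is an illustration under assumptions, not the study's implementation: weights exactly equal to 0.5 are mapped to 0 here (the text leaves this case open), scores are scaled so that the largest equals 1, and the two four-protein networks are hypothetical.

```python
def to_matrix(edges, n):
    """Symmetric weight matrix from {(i, j): weight}."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), w in edges.items():
        m[i][j] = w
        m[j][i] = w
    return m


def binarize(matrix, threshold=0.5):
    return [[1.0 if w > threshold else 0.0 for w in row] for row in matrix]


def average_networks(networks):
    n, k = len(networks[0]), len(networks)
    return [[sum(net[i][j] for net in networks) / k for j in range(n)]
            for i in range(n)]


def authority_scores(adj, iterations=200):
    """Kleinberg authority scores by power iteration, scaled to a maximum of 1."""
    n = len(adj)
    scores = [1.0] * n
    for _ in range(iterations):
        hubs = [sum(adj[i][j] * scores[j] for j in range(n)) for i in range(n)]
        scores = [sum(adj[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        top = max(scores)
        scores = [s / top for s in scores]
    return scores


# Hypothetical networks of two non-survivors over proteins 0..3 (not study data).
net1 = to_matrix({(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.7,
                  (0, 3): 0.6, (1, 3): 0.1, (2, 3): 0.2}, 4)
net2 = to_matrix({(0, 1): 0.7, (0, 2): 0.9, (1, 2): 0.3,
                  (0, 3): 0.4, (1, 3): 0.2, (2, 3): 0.1}, 4)
generalizing = average_networks([binarize(net1), binarize(net2)])
relevance = authority_scores(generalizing)
```

In this toy network protein 0 takes part in the most above-threshold edges across non-survivors and therefore receives the highest relevance score.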
Data Availability
The protein quantities table along with the associated metadata are provided in supplementary materials. All scripts used to train and assess the machine learning models are likewise provided.
Study approval
The study was approved by the ethics committee of Charité - Universitätsmedizin Berlin (EA2/066/20) and conducted in accordance with the Declaration of Helsinki and guidelines of Good Clinical Practice (ICH 1996). Written informed consent was obtained from all patients or legal representatives according to regulations set by the ethics committee of Charité - Universitätsmedizin Berlin. The study is registered in the German and the WHO international registry for clinical studies (DRKS00021688).
Acknowledgements
We thank Jan-David Manntz (Beckman, Germany) for help with the Biomek i7, Robert Lane, Jean-Baptiste Vincedent and Nick Morrice (SCIEX) for help with the TripleTOF 6600. This work was supported by the Berlin University Alliance (501_Massenspektrometrie, 501_Linklab, 112_PreEP_Corona_Ralser), by UKRI/NIHR through the UK Coronavirus Immunology Consortium (UK-CIC), the BMBF/DLR Projektträger (01KI20160A, 01ZX1604B, 01KI20337, 01KX2021), Charité-BIH Centrum für Therapieforschung (BIH_PA_covid-19_Ralser), the BBSRC (BB/N015215/1, BB/N015282/1), the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001134), the UK Medical Research Council (FC001134), and the Wellcome Trust (FC001134 and IA 200829/Z/16/Z). The work was further supported by the Ministry of Education and Research (BMBF), as part of the National Research Node ‘Mass spectrometry in Systems Medicine’ (MSCoresys), under grant agreement 031L0220A. Leif Erik Sander is supported by the German Research Foundation (DFG, SFB-TR84 114933180) and by the Berlin Institute of Health (BIH), which receives funding from the Ministry of Education and Research (BMBF). Martin Witzenrath is supported by grants from the German Research Foundation, SFB-TR84 C06 and C09, by the German Ministry of Education and Research (BMBF) in the framework of the CAPSyS (01ZX1304B), CAPSyS-COVID (01ZX1604B), SYMPATH (01ZX1906A) and PROVID project (01KI20160A) and by the Berlin Institute of Health (CM-COVID). Stefan Hippenstiel is supported by the German Research Foundation (DFG, SFB-TR84 A04 and B06), and the BMBF (PROVID, and project 01KI2082). Norbert Suttorp is supported by grants from the German Research Foundation, SFB-TR84 C09 and Z02, by the German Ministry of Education and Research (BMBF) in the framework of the PROGRESS 01KI07114. The study was further supported by Wellcome Trust (200829/Z/16/Z).
The Generation Scotland study received core support from the Chief Scientist Office of the Scottish Government Health Directorates (CZD/16/6) and the Scottish Funding Council (HR03006), and is now supported by the Wellcome Trust (216767/Z/19/Z). Archie Campbell is funded by HDR UK and the Wellcome Trust (216767/Z/19/Z). Caroline Hayward is supported by an MRC University Unit Programme Grant (MC_UU_00007/10) (QTL in Health and Disease). Riccardo Marioni is supported by an Alzheimer’s Research UK project grant (ARUK-PG2017B-10). H. Whitwell, JF. Timms, A. Zaikin and T. Nazarenko are supported by a Medical Research Council grant (MR/R02524X/1) and H. Whitwell, A. Zaikin and O. Blyuss by the Ministry of Science and Higher Education agreement No. 075-15-2020-808. H. Whitwell is supported by the National Institute for Health Research (NIHR) Imperial Biomedical Research Centre (BRC). J. Timms is supported by the National Institute for Health Research (NIHR) UCLH/UCL Biomedical Research Centre. Mirja Mittermaier is a participant in the BIH-Charité Digital Clinician Scientist Program funded by the Charité – Universitätsmedizin Berlin, the Berlin Institute of Health, and the German Research Foundation (DFG). Markus A. Keller is supported by the Austrian Science Funds (FWF; P33333) and the Austrian Research Promotion Agency (FFG, #878654). Figures were created with biorender.com.
Footnotes
PA-COVID-19 Study group, Charité – Universitätsmedizin Berlin
Malte Kleinschmidt, Katrin M. Heim, Belén Millet, Lil Meyer-Arndt, Nils B. Müller, Ralf H. Hübner, Tim Andermann, Jan M. Doehn, Bastian Opitz, Birgit Sawitzki, Daniel Grund, Peter Radünzel, Mariana Schürmann, Thomas Zoller, Fridolin Steinbeis, Florian Alius, Philipp Knape, Astrid Breitbart, Yaosi Li, Felix Bremer, Panagiotis Pergantis, Susanne Fieberg, Anne Wetzel, Moritz Müller-Plathe, Timur Özkan, Carola Misgeld, Dirk Schürmann, Bettina Temmesfeld-Wollbrück, Britta Stier, Martin Möckel, Jan A. Graaw, Victor Wegener, Marc Kastrup, Felix Balzer, Daniel Wendisch, Sophia Brumhard, Sascha S. Haenel, Philipp Georg, Claudia Conrad, Kai-Uwe Eckardt, Lukas Lehner, Jan M. Kruse, Carolin Ferse, Roland Körner, Andreas Edel, Steffen Weber-Carstens, Alexander Krannich, Saskia Zvorc, Linna Li, Uwe Behrens, Sein Schmidt, Maria Rönnefarth, Christina Pley, Claudia Fink, Chantip Dang-Heine, Robert Röhle, Emma Lieker, Christian Wollboldt, Yinan Wu, Georg Schwanitz, Constanze Lüttke, Denise Treue, Michael Hummel, Victor M. Corman, Christian Drosten, Christof von Kalle