Abstract
Purpose Although organ dysfunction is a defining element of sepsis, its trajectory is not well studied. We sought to identify whether there are distinct Sequential Organ Failure Assessment (SOFA) score trajectory-based subphenotypes in sepsis.
Methods We created 72-hour SOFA score trajectories in patients with sepsis from two diverse intensive care unit (ICU) cohorts. We then used Dynamic Time Warping (DTW) to compute patient similarities to capture evolving heterogeneous sequences and establish similarities between groups with distinct trajectories. Hierarchical agglomerative clustering (HAC) was utilized to identify subphenotypes based on SOFA trajectory similarities. Patient characteristics were compared between subphenotypes and a random forest model was developed to predict subphenotype membership, within 6 hours of ICU arrival. The model was then tested on the validation cohort.
Results A total of 4,678 and 3,665 unique sepsis patients were included in development and validation cohorts. In the development cohort, four subphenotypes of organ dysfunction were identified: Rapidly Worsening (n=612, 13.08%), Delayed Worsening (n=960, 20.52%), Rapidly Improving (n=1,932, 41.3%) and Delayed Improving (n=1174, 25.1%). In-hospital mortality for patients within different subphenotypes demonstrated distinct patterns over time. Similar subphenotypes and their associated outcome patterns were replicated in the multicenter validation cohort.
Conclusion Four novel, clinically-defined, trajectory-based sepsis subphenotypes were identified and validated. Trajectory based subphenotyping is useful for describing the natural history of sepsis in the ICU. Understanding the pathophysiology of these differential trajectories may reveal unanticipated therapeutic targets for patients with sepsis and identify more precise populations and endpoints for the predictive enrichment of clinical trials.
Introduction
Sepsis is defined as a dysregulated immunological response to infection that results in acute organ dysfunction [1]. The morbidity and mortality of sepsis remain high despite decades of research [2] and numerous failed clinical trials [3]. Recent research has highlighted that sepsis is a complex and heterogeneous syndrome, which includes a multidimensional array of clinical and biological features [4]. Identifying rigorous sepsis subphenotypes that present with similar prognostic markers and pathophysiologic features has the potential to improve therapy [5-8].
Most prior sepsis subphenotyping studies use static measurements available soon after admission to the emergency department or intensive care unit (ICU) to characterize patients [4, 9-11]. However, due to the stochastic nature of the initiation of infection and variable presentation to health care after developing symptoms, static assessments of sepsis subphenotypes may be incomplete, given the dynamic nature of the immune response and evolution of organ failure in sepsis [12].
More recently, dynamic trajectories of patient temperature have been identified in sepsis. The differential pattern of temperature change may represent a varied underlying inflammatory response to infection [1]. Whether multisystem organ failure develops distinct phenotypic patterns over time in sepsis remains largely unexplored. Identifying distinct organ dysfunction trajectories in sepsis can refine our understanding of the natural history of sepsis in the ICU in response to standard of care treatment and define patterns of disease that may benefit from novel therapeutic strategies [13].
In this study, we examined whether there are distinct subphenotypes of organ dysfunction trajectory over the first three days of an ICU admission with sepsis. We used longitudinal SOFA scores and Dynamic Time Warping (DTW) [14] to evaluate patients’ organ dysfunction trajectory similarities, as a surrogate for disease evolution. Clinical and biologic features including demographics, comorbidities, and inflammatory variables were used to characterize these subphenotypes at baseline. We then explored whether baseline features could predict these subphenotypes soon after ICU admission.
Methods
Overview
We did a cohort study on datasets from two ICU cohorts that contained granular patient level data from a total of twelve hospitals in the United States, whose overall workflow is illustrated in Fig. 1. Our goal was to derive sepsis subphenotypes of patients in ICU according to their SOFA organ dysfunction trajectories using Hierarchical Agglomerative Clustering (HAC). We then characterized these subphenotypes using comprehensive patient information including demographics, comorbidities, use of mechanical ventilation, type of ICU unit, admission source, organ source of sepsis, and examined their associated clinical outcomes as well as clinical biomarkers. We further built a random forest model to predict the derived subphenotypes from baseline patient clinical characteristics. To ensure reproducibility, the same analysis pipeline was conducted on the validation cohort.
Workflow of study
Definition of sepsis and study population
The development cohort was from the Medical Information Mart for Intensive Care III (MIMIC-III) database [15], derived from Beth Israel Deaconess Medical Center (BIDMC), from admissions dating from 2001-2012. BIDMC is a teaching hospital of Harvard Medical School in Boston, Massachusetts with 673 licensed beds, including 493 medical/surgical beds, 77 critical care beds, and 62 OB/GYN beds. The validation cohort was from the health system of Northwestern Medicine Enterprise Data Warehouse (NMEDW) [16], a network of eleven hospitals located in northern Illinois with 2,554 beds in total, with ICU admissions dating from 2012-2019. More details about both cohorts are shown in Supplemental Appendix 1. The inclusion-exclusion cascade for the patients in our study are shown in Supplemental Fig. S1, where Sepsis-3 criteria are defined as in Singer et al. [17].
SOFA score computation, model descriptions, and statistics analysis
The SOFA score was derived from six organ-specific subscores including respiration, coagulation, liver, cardiovascular, central nervous system (CNS), renal [13]. In our study, SOFA scores were derived every 6 hours within the first 72 hours of ICU admission. For each 6-hour period, the worst variable value was used to compute the SOFA subscores. Missing values were imputed using last observation carried forward (LOCF) or first observation carried backward (FOCB) [13].
After obtaining the SOFA trajectories of patients, we used DTW [14] to evaluate pairwise trajectory similarities, based on which the subphenotypes were identified through HAC [18]. McClain index [19] was used to determine the optimal numbers of subphenotypes.
After the subphenotypes were obtained, we used statistical testing to examine the associations between different clinical variables and biomarkers with the subphenotypes. Detailed list of clinical variables and biomarkers, as well as the concrete statistical testing techniques adopted, are shown in Supplemental Appendix 2.
Subphenotype prediction
We trained a random forest model to predict the derived subphenotypes from the baseline patient clinical characteristics within the first 6 hours after ICU admission, with the goal of examine whether the trajectory subphenotypes could be predicted early. Candidate predictors included demographics, comorbidities, SOFA subscores, lab tests, and vital signs. Overall prediction accuracy, along with the precision and recall each class are reported with the one-vs-rest scheme. Predictor contributions were evaluated with the Shapley additive explanations (SHAP) strategy [20].
Results
Basic cohort characteristics
For the development cohort (n=4,678), the median age was 65.93 years (IQR [53.71-77.88]). It included 2,625 male and 3367 white (71.98%). The overall in-hospital mortality rate was 10.86%, and the median ICU length-of-stay was 2.76 days (Interquartile Range (IQR) [1.59-5.57]). There were 1,893 patients (40.47%) treated with mechanical ventilation during the first three days. The mean baseline SOFA score obtained from the first 6 hours was 4.96 (Standard Deviation (SD): 2.80). Most of the patients (2,611, 55.81%) were in the medical intensive care unit (MICU). The demographic distribution of the validation cohort (n=3,665), is similar to the development cohort. The overall in-hospital mortality rate of 14.02% and the median length-of-stay was 3.84 days (IQR [1.98-7.94]). There were 1,524 patients (41.58%) that needed mechanical ventilation in the first three days. The mean baseline SOFA score was 5.68 (SD:2.81).
Comparisons between Survivors and Nonsurvivors
Nonsurvivors in development cohort were older than survivors, with a median age of 71.46 years (IQR, [59.87-80.93]) compared with 65.22 years for survivors (IQR, [53.18-77.35], p-value < 0.001). Nonsurvivors had higher comorbidity burden with a median Elixhauser index score [21] 7.0 (IQR [2.0-12.0]). Median ICU length-of-stay for nonsurvivors was 3.95 days (IQR [1.88-7.7]), and the rate of mechanical ventilation during the first three days was 59.84%. Nonsurvivors had higher baseline SOFA scores, with a mean value 7.10 (SD: 3.69). More nonsurvivors were admitted in MICU. More details about differences between survivors and nonsurvivors were shown in Supplemental Table S1, with similar statistics in validation cohort were shown in Supplemental Table S2.
SOFA trajectory and the derived subphenotypes
Based on the pairwise patients’ SOFA trajectory similarity matrix obtained from DTW, we generated clustermaps as in Supplemental Fig. S2, where four distinct clusters were identified through optimizing the McClain index [19].
Specifically, in the development cohort (Fig. 2(a)), Rapidly Worsening (n=612, 13.08%) was characterized by continuously increased SOFA scores from a mean (SD) of 4.52 (2.80) at admission to more than 7 at 72 hours. This subphenotype had the fewest patients. Delayed Worsening (n=960, 20.52%) was characterized by decreased SOFA scores within the first 48 hours from a mean (SD) of 5.23 (2.73) at baseline to 3.65 (2.83), followed by an increase over the last 24 hours. Rapidly Improving (n=1,932, 41.3%) was characterized by a consistent continuous improvement in SOFA scores from a mean (SD) of 5.54 (2.92) in baseline to less than 3. This was the most common and had the highest SOFA score at baseline. Delayed Improving (n=1174, 25.1%) was characterized by an increase and then a gradual decrease in SOFA score over 72 hours. It had the lowest SOFA score at baseline with mean (SD) 4.02 (2.36). Similar trajectory trends were obtained in the validation cohort (Fig. 2(b)) and the detailed analysis was provided in Supplemental Appendix 3. Individual SOFA subscore trajectories for each subphenotype were shown in Supplemental Fig. S3 and Supplemental Fig. S4.
Sequential Organ Failure Assessment (SOFA) trajectories of the subphenotypes and survival analysis in terms of the identified subphenotypes. A: Delayed Improving; B: Rapidly Improving; C: Delayed Worsening; D: Rapidly Worsening. The (a) and (b) describe the SOFA trajectories of the subphenotypes on development and validation cohorts, respectively. The (c) and (d) show the survival analysis results in subphenotypes on development and validation cohorts, respectively.
Patient characteristics comparisons across subphenotypes
Patient characteristics differed across the derived subphenotypes (see Table 1, Fig. 2(c) and (d), and Fig. 3). Specifically, Rapidly Worsening patients had the highest rates of mechanical ventilation (46.41%), the highest median Elixhauser comorbidity burden value of 5 (IQR [0-10]) but the lowest baseline SOFA score compared to the other subphenotypes. They had the worst mortality (Fig. 2(c) 28.27%, p-value<0.001) and longer length-of-stay (Table 1, 2.88 days, p-value<0.001). Rapidly Improving patients had the lowest rate of mortality (Fig. 2(c) 5.54%) and mechanical ventilation (37.94%), and the shortest length-of-stay (2.42 days). It had the highest proportion of patients meeting criteria for septic shock (15.48%, p-value=0.002). Delayed Improving and Delayed Worsening patients had lower rates of mortality (10.73%, 10.63%) and mechanical ventilation (42.50%, 39.27%) than the Rapidly Worsening subphenotype. The median age of the four subphenotypes were similar in the development cohort. Male patients were more common in all subphenotypes. More information was shown in Table 1. In addition, clinical biomarkers were different among subphenotypes. Chord diagrams (Fig. 3) showed these differences of subphenotypes in terms of abnormal clinical variables. The Rapidly Worsening group was more likely to have patients with abnormal cardiovascular biomarkers (bicarbonate, troponin T or I, lactate) and hematologic (such as hemoglobin, INR, platelet, glucose, RDW). Patients in this subphenotype had a higher chronic comorbidity burden and had abnormal SOFA subscores including respiration, coagulation and liver. The Rapidly Improving patients were more likely to have abnormal inflammatory lab values (temperature, WBC, bands, CRP, albumin, lymphocyte percent) and abnormal cardiovascular, CNS and renal SOFA subscores. There was a lower chronic comorbidity burden in this subphenotype. Delayed Worsening group had more abnormal hematologic and respiration, coagulation, CNS, and SOFA renal subscores. Abnormal respiration, coagulation, cardiovascular SOFA subscores were strongly associated with Delayed Improving. More information was shown in Supplemental Appendix 4 and Appendix 5.
Patient characteristics comparisons between subphenotypes
Chord diagrams showing abnormal variables by subphenotype. A: Delayed Improving; B: Rapidly Improving; C: Delayed Worsening; D: Rapidly Worsening; a: abnormal biomarkers vs. all subphenotypes; I: abnormal biomarkers vs. Delayed Improving; II: abnormal biomarkers vs. Rapidly Improving; III: abnormal biomarkers vs. Delayed Worsening; IV: abnormal biomarkers vs. Rapidly Worsening; b: abnormal subscores vs. all subphenotypes; V: abnormal subscores vs. Delayed Improving; VI: abnormal subscores vs. Rapidly Improving; VII: abnormal subscores vs. Delayed Worsening; VIII: abnormal subscores vs. Rapidly Worsening.
Subphenotype prediction
We trained random forest models for predicting the four subphenotypes according to early stage characteristics. Overall, for four subphenotypes, the prediction models obtained the accuracy of 0.75 (95% Confidence Interval [CI], [0.73, 0.76]) on development cohort and 0.79 (95% CI [0.77, 0.80]) on validation cohort. The precision and recall for each subphenotype on both cohorts are shown in Supplemental Table S6. Predictor contributions on both cohorts were shown in Fig. 4 and Fig. S5, which demonstrated different patterns when predicting different subphenotypes. For example, RDW, creatinine, bicarbonate and BUN contributes more for predicting the Rapidly Improving group, while platelet, INR, AST and lactate contributed more to the prediction of the Rapidly Worsening group.
Predictor contribution to the prediction of the predictive model by using the SHAP technique. Features’ importance is ranked based on SHAP values. In this figure, each point represented a single observation. The horizontal location showed whether the effect of that value was associated with a positive (a SHAP value greater than 0) or negative (a SHAP value less than 0) impact on prediction. Color showed whether the original value of that variable was high (in red) or low (in blue) for that observation. For example, in D, a low Platelets value had a positive impact on the Rapidly Worsening subphenotype prediction; the “low” came from the blue color, and the “positive” impact was shown on the horizontal axis. A: Delayed Improving; B: Rapidly Improving; C: Delayed Worsening; D: Rapidly Worsening.
Discussion
We reported four sepsis subphenotypes based on dynamic organ dysfunction trajectories using a novel data-driven methodology. DTW was used to calculate patients’ SOFA trajectory similarities because of its capability of capturing heterogeneous evolution among the temporal sequences more robustly, based on which HAC was leveraged to identify patient groups with similar trajectories. Four progression subphenotypes were identified, Rapidly Worsening, Delayed Worsening, Rapidly Improving, and Delayed Improving. Patients in the Rapidly Worsening subphenotype had progressive organ dysfunction over the time period.
The two Delayed groups had unstable organ dysfunction over the study period and the Rapidly Improving group had the highest admission organ dysfunction but quickly improved. Outcomes followed SOFA trajectory across each subphenotype irrespective of traditional septic shock categories, Rapidly Improving had the best outcomes and Rapidly Worsening the highest mortality. The Delayed Improving and Delayed Worsening had intermediate outcomes, including mortality, and length of stay.
The potential for distinct pathophysiologic etiologies for the differential trajectories is supported by the differential patterns of organ dysfunction, vital signs, inflammatory, hematologic, and cardiovascular variables at admission to the ICU. As shown in Fig. 3, and Supplemental Fig. S6 and S7, there were different variables that were associated with the groups over the course of the study. For example, those patients of Rapidly Improving were more likely to have more abnormal inflammatory markers (such as WBC, bands, CRP, albumin, temperature, lymphocyte) and more abnormal values on cardiovascular, and CNS subscores. There was a lower comorbidity score in patients with this subphenotype, which suggests that sepsis outcomes may be more dependent on underlying illness. The Rapidly Worsening patients had more comorbidities and distinct derangements in clinical variables associated with metabolic acidosis and hypoperfusion, e.g. a low bicarbonate and higher lactate, and disseminated intravascular coagulation, e.g. low platelets and a higher INR and respiratory failure. Both of the Delayed subphenotypes had a less specific group of variables associated with group membership, including inflammatory, hepatic, hematologic and pulmonary associated with Delayed Improvement and hematologic, cardiovascular and renal variables associated with Delayed Worsening.
Importantly, we built a multivariable prediction model for the identified trajectory subphenotypes from patient baseline characteristics and early-stage clinical features. Models were built on the first 6 hours after being admitted to ICU. Several interesting findings were obtained. For example, (1) A high comorbidity score tended to predict the subphenotypes of Rapidly Worsening because patients with high comorbidity score had multiple diseases and were more likely to present worse organ dysfunction in ICU. (2) The roles of lab tests and vital signs were different on prediction. For example, low Platelets had a positive impact on the Rapidly Worsening prediction and high Platelets had a positive impact on the Rapidly Improving prediction. The high bands and bilirubin tended to predict the subphenotype of Delayed Improving. The high values of lactate and INR tended to predict the subphenotype of Rapidly Worsening. (3) Some features such as BMI and lymphocyte count did not contribute to discriminating the subphenotypes significantly.
Our manuscript complements and adds to other recent study of sepsis subphenotypes. For example, Seymour et al. [4] and Knox et al. [9] each identified four subphenotypes that were associated with organ dysfunction patterns and clinical outcomes in patients with sepsis using a panel of baseline clinical variables. There is some overlap in our high risk groups, notably both include liver injury and shock. However, our work demonstrates that the difference in outcome in this group is due to progressive non-resolving organ dysfunction that calls for novel treatments. Bhavani et al. [1] identified novel longitudinal temperature trajectories to identify four sepsis subphenotypes, with significant variability in inflammatory markers and outcomes, highlighting the potential for novel immune signatures to be uncovered through trajectory analysis.
A major strength of this analysis is that we have identified time-dependent progression patterns that may be related to the differential response of specific organ dysfunction to standard of care interventions. For example, the Rapidly Improving group had cardiovascular and respiratory failure at admission that resolved over 72 hours. The Rapidly Worsening groups developed multisystem organ failure including visceral organ dysfunction, specifically liver failure in addition to cardiovascular and respiratory failure. These differential patterns suggest varying time-dependent, treatment responsive organ dysfunction pathophysiology in sepsis. The cardiovascular and respiratory subscores are driven by the vasopressor dose and PaO2/FiO2 respectively, which may respond to therapeutic interventions such as corticosteroids, volume resuscitation, and the application of PEEP or therapeutic suctioning. [22].However, as demonstrated by our analysis sepsis-related renal and liver failure may be less modifiable with our current therapeutic strategies over the past twenty years [23, 24]. Our study highlights that patterns of organ dysfunction in patients with sepsis are Rapidly Improving, Rapidly Worsening and Delayed. Each of these patterns may be due to a different pathophysiology and benefit from different treatments in the future.
It deserves noting that our Rapidly Improving patients had better outcomes across all patients studied, but still represented 21% and 36% of all deaths in our derivation and validation cohorts respectively, despite an overall 5% and 10% in-hospital mortality. This low mortality rate has implications for powering clinical trials [25] and further research is needed to understand the cause of death in patients with rapidly improving organ dysfunction in sepsis. The Rapidly Worsening subphenotype was the least common and may represent patients with our classical understanding of hyperinflammatory septic shock [26]. More recent evidence suggests that the pathophysiology of early, progressive organ dysfunction in our Rapidly
Worsening patients may be due to over exuberant activation of necroinflammatory cell death pathways in multiple organs, highlighting the need for novel treatment strategies [27-29]. The Delayed Worsening and Improving subphenotypes were common, had intermediate outcomes across our cohorts, and more nuanced differences in clinical characteristics. These trajectories may be influenced by non-resolving inflammation [30] or immune paralysis [31]. Notably, at 72 hours subjects in the Delayed Worsening subphenotype had diminished leukocyte counts and lymphocyte percentages compared with other groups, which may reflect pathologic T-cell mechanisms described in immune paralysis. These include increased expression of the inhibitory receptor programmed cell death protein 1 (PD-1) and clonal deletion of pathogen-specific cells – mechanisms that may ultimately result in attenuated cell proliferation and immune exhaustion [32]. Further understanding of the biology underlying these subphenotypes will be critical to develop the next generation of treatments for sepsis in all of its forms.
Limitations
This study has several limitations. First, our sepsis subphenotypes were identified based on the data-driven method, which may not be directly related to underlying differences in biology. Biologically derived subphenotypes may help refine our understanding of differential disease progression and the potential for therapeutics to alter the course. Second, because phenotypes were derived in the background of standard of care therapy, it is unclear whether specific processes of care influenced the development of these trajectories. We believe that the stability of these subphenotypes despite secular trends across eleven institutions spanning two decades adds to the generalizability of our findings. Third, although we used eleven separate hospitals in validation, each center is located in the United States and affiliated with an academic center, which may limit generalizability to other locations of care. Lastly, we did not evaluate the effect of specific randomized interventions on SOFA score trajectory.
Conclusions
We have discovered four novel SOFA score trajectory sepsis subphenotypes with different natural histories following admission to the ICU. Our results suggest that these four subphenotypes represent a differential host pathogen response in the setting of current standard of care therapy. Further understanding of the underlying biology of trajectory based subphenotypes may reveal insights into sepsis pathophysiology and improve the personalization of sepsis management and the predictive enrichment of clinical trials.
Data Availability
The data and code are available upon request.
Author contributions
ES, Yuan and FW for conceptualization, investigation, writing, reviewing and editing of the manuscript. ZX for data analysis, drafting, editing and reviewing manuscript. CM for data analysis, editing and reviewing manuscript. CS, IS, LT and DP for discussion, commenting and editing the manuscript.
Data and Code availability
The data and code are available upon request.
IRB Information
The research is approved by Institutional Review Board of Northwestern University STU00202840.
Declarations Conflicts of interest
ES reports a conflict for work after January of 2021 and have received personal fees for consulting work for Axle informatics for COVID vaccine clinical trials through NIAID. The other authors declare that they have no conflict of interest.
Acknowledgements
The work of ZX, CZ and FW are supported by NSF 1750326, NIH RF1AG072449 and NIH R01MH124740. The work of CM and YL are supported in part by NIH 1R01LM013337.