ABSTRACT
The goal of this research is to explore quantitative motor features in critically ill patients with severe brain injury (SBI). We hypothesized that computational decoding of these features would yield important information on underlying neurological states and clinical outcomes. Using wearable microsensors placed on all extremities, we recorded 1,701 hours of continuous, high-frequency accelerometry data from a prospective cohort of patients (n = 69) admitted to the ICU with SBI. Models were trained using time-, frequency-, and wavelet-domain motion features and levels of responsiveness and outcome as labels. The two primary tasks were detection of levels of responsiveness, assessed by motor sub-score of the Glasgow Coma Scale (GCSm), and prediction of functional outcome at hospital discharge, measured with the Glasgow Outcome Scale–Extended (GOSE). Detection models achieved significant (AUC: 0.70 [95% CI: 0.53–0.85]) and consistent (observation windows: 12 min – 9 hours) discrimination of SBI patients capable of purposeful movement (GCSm > 4). Prediction models accurately discriminated SBI patients of upper moderate disability or better (GOSE > 5) with 2–6 hours of observation (AUC: 0.82 [95% CI: 0.75–0.90]). Results suggest that computational analysis of time series motor activity in patients with SBI yields clinically important insights on underlying neurologic states and short-term clinical outcomes.
INTRODUCTION
Despite advances in intensive care, the global burden of severe brain injury (SBI), in terms of mortality, long-term disability, and economic costs, is the highest among all major injuries1. Existing approaches to predict SBI outcomes, such as recovery of consciousness and functional independence, are imprecise for individual patients2 and can raise ethical concerns due to the potential for withdrawal of life-sustaining therapies3. At the same time, recent developments in artificial intelligence and big data processing represent an opportunity to optimize SBI patient monitoring with high-resolution, longitudinal waveform data and to improve the precision of SBI prognoses with flexible modeling strategies4. Hence, a key focus in the care of SBI is the discovery and validation of quantitative monitoring modalities that improve upon the accuracy and reliability of clinical characterization and the reliability of predicted outcomes5.
For acute neurological disorders, the assessment of motor function provides an important clinical window into neural systems associated with sensorimotor processing, emotion, coordination, planning, and learning6–8. Neurological damage and intensive care unit (ICU) practices (e.g., sedation, bedrest) are associated with a dramatic reduction in normal physical activity9, resulting in systemic pro-inflammatory signaling10 and an elevated risk of venous thromboembolism, infection, skin and soft tissue damage, delirium, and loss of muscle mass and strength11–14. A corollary is that structured programs designed to increase physical activity for SBI patients in the ICU can significantly reduce neurological complications and may lead to improved functional recovery15. However, it is uncertain whether the incorporation of continuous motion sensing in the ICU could yield clinically significant gains for SBI monitoring and prognosis.
Wearable accelerometers provide an objective and continuous assessment of motor activity over extended periods of time16. In contrast to most other motion sensing modalities17, integration of accelerometers in the ICU is feasible. Advances in microelectromechanical systems (MEMS) technology have made it possible to construct inexpensive, minimally obtrusive wearable accelerometers that can be optimized for the clinical space18. Accelerometers respond to changes in movement frequency and intensity, measure tilt from the gravitational axis, and produce little variation or drift over time18–22. The use of accelerometers to monitor gross physical activity in the ICU has already been tested with varying degrees of success23. Herein, we aim to more specifically determine whether a relationship exists between motion features derived from triaxial accelerometry time-series and neurological motor states and functional outcomes of SBI patients.
In this pilot study of the Neurological Injury Motion Sensing (NIMS) project, we explore the impact and limitations of high-resolution accelerometry in patients with SBI admitted to the ICU. We developed a matrix of wearable accelerometers to quantitatively capture motor activity from the extremities of SBI patients. Applying techniques from time-series analysis, dimensionality reduction, and logistic regression, we extract interpretable time-, frequency-, and wavelet-domain motion features and assess their performance in motor function detection and short- and long-term functional outcome prediction models. We then assess relative significance of the extracted features to determine how specific accelerometry profiles relate to clinically evaluated motor function and global outcomes. Finally, we demonstrate how accelerometry-based model outputs can potentially be used to monitor neurological transitions in clinical practice.
RESULTS
Study population characteristics
Of the 72 total SBI patients recruited in the ICU, 3 participants were excluded from the study due to withdrawn consent (n = 2) or corruption of accelerometry data during upload (n = 1), resulting in a study population of n = 69. Five patients were lost from one-year follow-up due to unsuccessful contact, and thus the study population at 12 months post hospital discharge was n = 64. Detailed characteristics of the study population are summarized in Table 1.
From each of the study participants, we collected triaxial accelerometry data (sampled at 10 Hz) from a wearable matrix of 6 sensors, placed on each elbow, wrist, and ankle and an additional sensor placed on the bed for external movement correction (Fig. 1a). The median recording duration per patient was 24.09 hours (IQR: 22.81–25.11 hours), and accelerometry data was recorded fairly uniformly across the stages of ICU stay in terms of proportion completed (S. Fig. 1d). In total, 1,701 hours of multisegmental accelerometry data were recorded.
During their stay in the ICU (median: 19 days, IQR: 11–29 days), study participants were evaluated with the Glasgow Coma Scale (GCS)24, 25 a median 9.25 times per day (IQR: 7.17–11.50 times per day). In total, we extracted scores from 14,240 GCS evaluations, 13,190 of which (92.63%) took place in the ICU and 653 of which (4.59%) coincided with accelerometry capture times. The trajectory of the motor component scores of the GCS (GCSm), along with corresponding times of accelerometry capture, of each patient included in our analysis is provided in S. Fig. 2.
Motor function detection performance
In this work, we use clinically evaluated GCSm scores extracted from electronic health records (EHR) as the primary markers of functional motor states. The scores of the 6-point GCSm are defined by best motor responses to physical stimuli and are outlined in Table 1.
We trained and evaluated threshold-level GCSm detection models from automated accelerometry-based motion features extracted from 19 varying observation windows directly preceding the GCSm evaluations (Fig. 1b). The count distributions of GCSm scores available for each observation window are listed in S. Table 1.
The receiver operating characteristic (ROC) curves of the optimally discriminating models at each GCSm threshold, along with their mean areas under the curves (AUC) and optimal observation windows, are shown in Fig. 2a. Based on the 95% confidence intervals of mean AUC, significant discrimination (AUC > 0.5, α = 0.05) was achieved by the extracted features at every threshold of GCSm except for GCSm > 2. However, only GCSm > 4 detection models achieve significant discrimination from shorter observation window durations (≤ 30 minutes); GCSm > 4 detection models achieve significant discrimination consistently with an observation window of 12 minutes or greater (Fig. 2b). The mean AUCs, along with 95% confidence intervals, at each threshold of GCSm are provided for all 19 tested observation windows in S. Table 2. Because GCSm > 1, GCSm > 3, and GCSm > 5 detection models achieve significant discrimination at three or fewer observation windows, only GCSm > 4 detection models achieve significant discrimination across a broad range of observation windows (12 min – 9 hours). Binary classification performance metrics of optimally discriminating motor function detection models are provided in Table 2. At none of the GCSm thresholds do the models achieve significantly greater accuracy than the proportion of the most represented class based on 95% confidence intervals. Only the GCSm > 4 detection model achieves a higher mean accuracy (0.71) and a significantly greater F1 score (0.78 [95% CI: 0.67–0.87]) than its proportion of the most represented (in this case, positive) class (0.66). Only GCSm > 4 and GCSm > 5 detection models achieved both a mean sensitivity and mean specificity over 0.5, but not significantly.
Functional outcome at hospital discharge prediction performance
We used clinically evaluated Glasgow Outcome Scale – Extended (GOSE) scores as the primary markers of functional outcomes, both at hospital discharge and at 12 months post discharge. The scores of the 8-point GOSE are outlined in Table 1.
We trained and evaluated threshold-level GOSE at hospital discharge prediction models from automated accelerometry-based motion features extracted from the same 19 varying observation windows directly preceding GCSm evaluations (Fig. 1b). The median lead window duration (i.e., time between end of observation window and hospital discharge) was 20 days (IQR: 10–33 days). The count distributions of GOSE scores, at discharge, available for each observation window are listed in S. Table 3. Given the low proportion of patients (1.45%) with good recovery (GOSE > 6) at hospital discharge, we limited our threshold-level analysis to GOSE > 1, GOSE > 2, GOSE > 3, GOSE > 4, and GOSE > 5.
The receiver operating characteristic (ROC) curves of the optimally discriminating models at each GOSE threshold, along with their mean areas under the curves (AUC) and optimal observation windows, are shown in Fig. 3a. Based on the 95% confidence intervals of mean AUC, significant discrimination (AUC > 0.5, α = 0.05) was achieved by the extracted features only at GOSE > 5. GOSE > 5 prediction models achieve significant discrimination at observation windows of two hours or greater, with a peak mean AUC of 0.82 (95% CI: 0.75–0.90) at an observation window duration of 6 hours (Fig. 3b). The mean AUCs, along with 95% confidence intervals, at each tested threshold of GOSE are provided for all 19 tested observation windows in S. Table 4.
Binary classification performance metrics of optimally discriminating functional outcome prediction models are provided in Table 2. At none of the GOSE thresholds do the models achieve a significantly greater F1 score than the proportion of the positive class or a greater mean accuracy than the proportion of the most represented class. Despite its strong discrimination performance, the GOSE > 5 prediction model achieves near-zero precision and sensitivity. While we observe an ideal precision-recall curve for this model (S. Fig. 3a), the mean average precision is only 0.08 (95% CI: 0.02–0.18). This indicates that, while prediction probabilities for true positive cases are, on average, greater than prediction probabilities for true negative cases, they seldom cross the 0.5 threshold for proper classification (S. Fig. 3b).
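The gap between ranking performance and thresholded classification described in this paragraph can be illustrated with a small, hypothetical example. The labels and probabilities below are invented for illustration and are not study data; the function names are likewise our own. The sketch shows how a model can rank every positive case above every negative case (AUC of 1.0) while no predicted probability reaches the 0.5 decision threshold, giving zero sensitivity.

```python
def auc(labels, probs):
    """AUC as the probability that a random positive case is ranked above
    a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at(labels, probs, threshold=0.5):
    """Fraction of positive cases whose probability reaches the threshold."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    return sum(p >= threshold for p in pos) / len(pos)


# Hypothetical, imbalanced example: 2 positive cases among 10.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.30, 0.25, 0.10, 0.08, 0.05, 0.05, 0.04, 0.03, 0.02, 0.01]

print("AUC:", auc(labels, probs))
print("Sensitivity at 0.5:", sensitivity_at(labels, probs))
```

In such a case, lowering the decision threshold or recalibrating the model with more positive cases would be needed before the ranking could translate into useful classification.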
Functional outcome at 12 months post discharge prediction performance
We trained and evaluated threshold-level GOSE at 12 (±1) months post hospital discharge prediction models from automated accelerometry-based motion features extracted from the same 19 varying observation windows directly preceding GCSm evaluations (Fig. 1b). The count distributions of GOSE scores, at 12 months, available for each observation window are listed in S. Table 5. The receiver operating characteristic (ROC) curves of the optimally discriminating models at each GOSE threshold, along with their mean areas under the curves (AUC) and optimal observation windows, are shown in S. Fig. 4a. Based on the 95% confidence intervals of mean AUC, significant discrimination (AUC > 0.5, α = 0.05) was not achieved by the extracted features at any of the GOSE thresholds. Mean AUC is largely independent of observation window duration at each of the thresholds (S. Fig. 4b). The mean AUCs, along with 95% confidence intervals, at each threshold of GOSE are provided for all 19 tested observation windows in S. Table 6. Binary classification performance metrics of the optimally discriminating models for functional outcome prediction at 12 months post discharge are provided in Table 2.
Calibration of motor function detection and functional outcome prediction
The probability calibration curves and associated prediction distributions of the optimally discriminating models at each threshold for GCSm detection and GOSE (at hospital discharge) prediction are provided in Fig. 4a and Fig. 4b respectively. We observe that the GCSm > 4 detection model achieves the best graphical model calibration of all those tested (Emax = 0.30 [95% CI: 0.08–0.64]). However, when considering the prevalence of predicted probabilities in calibration assessment with the integrated calibration index (ICI)26, we observe that the GOSE > 5 prediction model has the most ideal calibration (ICI = 0.01 [95% CI: 0.00–0.02]). The discrepancy between the weighted and graphical calibration of GOSE > 5 indicates a strong class imbalance, suggesting that more positive cases are necessary to train and recalibrate this model for proper classification. Probability calibration metrics of all optimally discriminating models are provided in Table 3.
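The contrast between a maximum calibration error and a prevalence-weighted calibration error can be made concrete with a simplified sketch. Note the assumptions: the ICI is defined on a smoothed calibration curve, whereas the code below substitutes equal-width probability bins for simplicity, and the data are hypothetical rather than drawn from the study. The function name `binned_calibration` is our own.

```python
def binned_calibration(labels, probs, n_bins=10):
    """Return (maximum error, count-weighted mean error) between mean
    predicted probability and observed event rate across non-empty bins.
    This is a binned stand-in for Emax and ICI, not the smoothed versions."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to last bin
        bins[idx].append((y, p))
    errors, weights = [], []
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        errors.append(abs(predicted - observed))
        weights.append(len(members) / len(labels))
    e_max = max(errors)
    weighted = sum(e * w for e, w in zip(errors, weights))
    return e_max, weighted


# Hypothetical imbalanced data: 100 well-calibrated low-probability cases
# and 5 positive cases that are all under-predicted at 0.4.
labels = [1] * 2 + [0] * 98 + [1] * 5
probs = [0.02] * 100 + [0.4] * 5

e_max, weighted = binned_calibration(labels, probs)
print(f"max error: {e_max:.3f}, weighted error: {weighted:.3f}")
```

Because nearly all predictions sit in the well-calibrated low-probability bin, the weighted error stays small even though the few high-probability predictions are badly miscalibrated, which mirrors the discrepancy noted for the GOSE > 5 model.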
Extracted feature and sensor placement analysis
At the end of our accelerometry processing pipeline (Fig. 1), we extracted eight unique feature types (Table 4) from each of the six accelerometers placed around SBI patient joints. For each of these 48 feature-sensor combinations, we calculate a relative significance score equivalent to the mean absolute value of the learned coefficients of supervised dimensionality reduction (i.e., the relative importance in explaining the variance in the dataset stratified by the endpoint) weighted by the absolute value of learned logistic regression coefficients (see Methods).
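One possible reading of this weighting scheme is sketched below. The exact computation is specified in the Methods and is not reproduced here, so the normalization by the summed absolute regression coefficients, the variable names, and the numbers are all assumptions made for illustration.

```python
def significance_scores(loadings, lr_coefs):
    """Weighted mean absolute loading per feature.

    loadings[j][k]: learned coefficient of feature j on reduced component k
                    (hypothetical layout).
    lr_coefs[k]:    logistic regression coefficient of component k.
    """
    total_weight = sum(abs(b) for b in lr_coefs)
    return [
        sum(abs(a) * abs(b) for a, b in zip(row, lr_coefs)) / total_weight
        for row in loadings
    ]


# Hypothetical example: 2 features, 2 reduced components.
loadings = [[0.8, -0.1],
            [0.2, 0.9]]
lr_coefs = [2.0, -0.5]

scores = significance_scores(loadings, lr_coefs)
print(scores)
```

Under this reading, a feature scores highly when it loads strongly on components that the logistic regression itself weights heavily, regardless of the sign of either coefficient.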
We consider the optimally discriminating configurations of the two most promising model types as representatives for motor function detection and functional outcome prediction respectively: (a) GCSm > 4 with a 6-hour observation window and (b) GOSE (at hospital discharge) > 5 with a 6-hour observation window. The feature significance scores of these two model types are visualized as heatmaps in Fig. 5a and Fig. 5b respectively.
For both motor function detection and functional outcome prediction, there is more variation in significance scores across feature types than across sensor placements. For motor function detection, the proportion of dynamic activity (PDA) in the observation window, the frequency-domain entropy (FDE), and the median frequency (MFR) are the three most significant feature types, descending in that order. For functional outcome prediction, the descending order of the three most significant feature types is FDE, MFR, and PDA. PDA is a crude measurement of overall physical activity27, while FDE enables differentiation between activity profiles which have simple acceleration patterns and those with more complex patterns16. From the pair of high-pass-filtered medians (HLF (h)) and low-pass-filtered medians (HLF (l)), HLF (h) has a significantly greater mean significance score than HLF (l) for every sensor placement in both model endpoints based on 95% confidence intervals. This, along with the relative significance of MFR, suggests that finer movements, captured in higher frequencies of accelerometry, can be more clinically significant in discriminating functional motor states and global outcomes from SBI. Moreover, the consistently strong significance of PDA, FDE, and MFR suggests that features of both the time domain (PDA) and the frequency domain (FDE, MFR) in combination may be useful for clinical assessments of functional neurological states.
In detecting motor function, the right wrist (RW) sensor was the most significant placement across the five most significant feature types. The trajectories of mean motion feature values in the six hours preceding GCSm evaluations (S. Fig. 5) visually demonstrate that features extracted from the wrist-placed sensors better discriminate cases of GCSm 5 and 6 from the rest of the GCSm scores. This follows clinical observations of a greater frequency of conscious movement in hands and wrists of bedridden SBI patients during ICU stay. Moreover, abnormal profiles of flexion and extension, associated with SBI, are most often observed in the wrists, and thus, the wrist-placed sensors may be more sensitive to abnormal patterns of movement, corresponding to lower levels of consciousness, than the elbow- or ankle-placed sensors.
In functional outcome prediction, we observe the greatest significance scores ascribed to wrist-placed sensors (RW and LW) in the most significant frequency-domain features (FDE and MFR), but the ankle- (RA and LA) and elbow-placed sensors have the greatest significance scores in the most significant time-domain features (PDA, SMA, and HLF (h)). Wrist movements are finer than elbow and ankle movements and may be best distinguished in the frequency-domain in relation to global outcomes.
The correlation of each of the extracted motion features across the six sensor placements is visualized in S. Fig. 6, and violin plots of the distributions of motion features, stratified by GCSm, are presented in S. Fig. 7.
Retrospective case study analysis of motor function detection in practice
Six patients in our study experienced a transition between GCSm > 4 and GCSm ≤ 4 within the GCS observations coinciding with 6-hour observation windows of accelerometry recording. For each of these six patients, we trained GCSm > 4 detection models on the remaining patient set with a shorter (27-minute) and a longer (6-hour) observation window. We return predictions with these models on the six case study patients every ten minutes to retrospectively examine the trajectories of probabilities against the recorded times of neurological transition (Fig. 6).
In case no. 2, we observed that both model types detect an upward transition in GCSm more than three hours before it was reported in the EHR. Likewise, the 27-min observation window model detected a downward transition in GCSm about an hour before the upcoming evaluation in case no. 4 and about two hours before in case no. 3. In cases no. 3, 4, and 6, we observed that the 6-hour observation window detects the appropriate transition in GCSm, but with a delay of 3–6 hours. In cases no. 1 and 5, in which we observe a shift and resettlement of GCSm within a 3–5-hour span, the 6-hour model fails to detect the transition while the 27-min model uncertainly oscillates above and below the midline. In general, the shorter observation window model was more dynamic and detected GCSm transitions quicker than the longer observation window model. However, persistent transitions, such as the one observed in case no. 6, were detected with more stability and reliability by the longer observation window model.
DISCUSSION
Key findings
We introduce an accelerometry-based system in critically ill SBI patients that quantitatively captures multisegmental motor patterns correlating with clinical scores of motor responsiveness and functional outcome. The results reveal a significant (AUC = 0.70 [95% CI: 0.53–0.85]), consistent (observation windows: 12 min – 9 hours) association between extracted motion features and the discrimination of SBI patients capable of purposeful movement (GCSm > 4) and those who are not (GCSm ≤ 4) (Fig. 2a). A significant discrimination of purposeful movement was achieved with only 12 minutes of accelerometry recording (Fig. 2b), and reliable calibration (Fig. 4a) and informative classification (Table 2) for GCSm > 4 detection suggest that iterations of this system could be clinically useful in automating motor function monitoring. In case studies (Fig. 6), we demonstrate that accelerometry-based systems may detect transitions in motor function up to five hours before a clinical evaluation.
The utility of accelerometry-based features for functional outcome prognosis remains ambiguous. While we found no signal between motion features and long-term (12 months post discharge) outcomes (S. Fig. 4), the models accurately predicted functional status at hospital discharge (AUC = 0.82 [95% CI: 0.75–0.90]) at a cutoff of GOSE > 5 vs GOSE ≤ 5 for favorable vs unfavorable outcome (Fig. 3a). Patients with a GOSE of >5 have upper moderate disability or good recovery and are generally able to resume work or previous activities. However, given the small number of SBI patients with GOSE > 5 at hospital discharge, further validation is necessary to determine the reliability of this result. Conflicting results between the precision recall curve and average precision (S. Fig. 3a) and between different calibration metrics (Table 3) underline the class imbalance problem of GOSE > 5 in our dataset; at the same time, we find the consistent discrimination (Fig. 3b) and difference in outcome distribution (S. Fig. 3b) as promising markers for further exploration.
Finally, our analysis of feature significance (Fig. 5) reveals that both time-domain and frequency-domain features are important for motor function detection and functional outcome prediction. While sensors placed on the wrist achieved the greatest significance scores overall, particularly for features in the frequency-domain, multisegmental motion capture was validated by comparable significance scores of elbow- and ankle-placed sensors across the feature set.
Relationship with previous studies and future implications
Results presented here represent, to our knowledge, the first approach to relate motion sensor data to neurological states in SBI patients admitted to the ICU. Healthy activity classification with accelerometry-based features has become widespread, especially with advancements in MEMS technology, machine learning, and data sharing16, 28–30. However, applications to intrahospital care, and in particular intensive care, have been limited31, 32 and have largely taken only simple, threshold-based feature approaches to grossly evaluate motor activity (e.g., actigraphy)33–36. Reported success in these studies has been variable, but none of them have combined the high-resolution time-domain, frequency-domain, and wavelet-domain analysis found in more recent healthy activity classification studies. The focus of our approach, on the relationship between motor profiles of SBI patients over extended periods of time and clinically relevant neurological states, is novel. Yet, it builds upon the developments in time-series analysis, dimensionality reduction, and supervised machine learning from activity classification projects as well as the hypotheses of the clinical validity and utility of accelerometry from applied, medical projects.
A continuous high-frequency motion capture system in the intensive care setting produces a high-dimensional dataset that is also valuable for data-driven research projects. Profiles of motor activity in SBI are poorly understood and decoding specific features of motion in the time-, frequency-, and wavelet-domains can open a window on internal neurological states. Accelerometry-based features may elucidate fundamental mechanisms underlying the strong association between physical activity and clinical outcomes, and we aim to collect more data in the NIMS project to enable the research and development of motion as a quantitative marker of functional recovery for SBI.
More generally, the critical care setting is a fertile ground for the development of advanced computational methods and applications of artificial intelligence for monitoring and decision support37. Patients are typically interfacing with physiological monitoring systems that generate a large volume of data whose complexity may overwhelm human interpretation alone but may be ideal for the training of analytical systems38. Since critical care specialists typically must make time-sensitive decisions for multiple patients at the same time39, we expect that a near-real-time computational framework assessing motion features alongside other time-series data continuously could provide valuable decision support. We expect that ongoing and subsequent iterations of this work will enable integration of computational physical activity features into the framework of monitoring and prognostication in the critical care setting.
Study limitations
We recognize several limitations in this work that need to be addressed. Our statistical analyses and retrospective validation of GCSm detection and GOSE prediction were performed on a limited sample size (n = 69 patients) from a single institution and intensive care facility. Further validation will require repeated trials on larger patient populations across multiple centers. There are also improvements to be made to the sensor itself. The planar dimensions of our currently used accelerometer (42 mm × 32 mm) can be reduced further to increase the resolution of localized motion capture. Furthermore, since accelerometry measurements depend on the orientation of the accelerometer with respect to the vertical (gravitational axis), additional modalities of motor output (e.g., gyroscopy and electromyography) could be integrated into the sensor system to inform computational models on the precise arrangement and neural activation of body segments. This would allow us to derive more physiologically relevant features that correspond to validated models of nervous system injury or disease6. We also recommend the development of sensors with higher sampling frequencies (≥40 Hz) to capture extremely fine or fast movements of digits or lower extremities. Additionally, GCSm itself has been criticized for lack of standardization among practitioners40, 41. GCSm scores for this work were extracted automatically from EHR and were measured by multiple practitioners across the Johns Hopkins Hospital Neurosciences Critical Care Unit (NCCU) staff. Moving forward, we aim to supplement clinical validation of the motion features with multifactorial associations with other consciousness, functional, cognitive, psycho-behavioral, symptomatic, and social outcome scales of SBI patients42.
METHODS
Study population and experimental protocol
This work was conducted with approval from the Johns Hopkins Medicine Institutional Review Board (IRB00135674) and written informed consent from patients or surrogates. We prospectively enrolled 72 patients admitted to the NCCU who met the following criteria: age ≥ 18 years, SBI defined as an acute brain injury or illness resulting in impaired consciousness, absence of injuries or lesions involving the extremities, and not expected to die or have withdrawal of life-sustaining therapies in the 24 hours following enrolment.
Patients were evaluated daily while in the NCCU, at hospital discharge, and at 12 months post discharge by research team members. All GCS evaluations during each patient’s hospital stay were automatically extracted from the institutional EHR system (Epic Systems, Madison, WI, USA). GOSE scores at hospital discharge were obtained by EHR review of discharge reports for patients who survived during hospital stay (n = 53). Patients were contacted by telephone 12 months (±1 month) after hospital discharge, and GOSE scores were obtained using a validated questionnaire43 (n = 27); in cases where patients could not be contacted, data was extracted from EHR reports (n = 8). Additionally, we identified participants who died between discharge and 12 months post-discharge from national obituary records (n = 12). Thus, we arrived at a 12-month post-discharge sample size of n = 64.
From the first 3 patients, we collected 10 hours of continuous triaxial accelerometry data, and for the remainder of the patients, we augmented our recording duration to between 24 and 48 hours of accelerometry data.
Instrumentation for accelerometry capture
Triaxial sensors (SensorTags CC2650, Texas Instruments, Dallas, TX, USA) were attached with transparent film dressing (Tegaderm Diamond Pattern 1686, 3M, Maplewood, MN, USA) bilaterally near the joints (with common orientation) designated in Fig. 1a. An additional sensor was placed vertically on the foot of the patient bed to detect patient-independent bed movements. Sensors were equipped with MEMS, variable capacitance tri-axial accelerometers (MPU-9250 MotionTracking Device, TDK InvenSense, San Jose, CA, USA) with sampling frequency (fs) set to 10 Hz, the range of measurable amplitude at ±16 g (±157 m/s²), and sensitivity at ±4,800 least significant bits per g (LSB/g).
The sensors transmitted data via a 2.4-GHz Bluetooth antenna to a portable Linux computer (RPi 3 Model B, Raspberry Pi Foundation, Cambridge, UK) placed in the NCCU room. We would execute a Python script on the computer to collect 3 channels (axes) of accelerometry time series from each of the 7 active accelerometers in parallel. In the event of a sensor failure, the system would log the interruption in a separate .txt file. During each trial, we also recorded a video stream (M1045-LW Network Camera, Axis Communications, Lund, Sweden) of the patient that clearly shows the location of each sensor. In the event of sensor interruptions, irregular movement profiles, or bed-sensor-extracted signal magnitude area (SMA) values above 0.135 g27, we would check the footage to identify the source of these results.
Accelerometry processing and motion feature extraction
Each axial component of each sensor was convolved with a 4th-order Butterworth high-pass filter with a critical frequency of fc = 0.2 Hz (S. Fig. 8) to remove the baseline offset of accelerometry readings (Fig. 1a) and generally separate the low frequency effect of static orientation from the high frequency effect of active body movement44.
Filtered time-series were segmented into non-overlapping 5-second windows (~50 data points per window) for motion feature extraction. We selected the motion features listed in Table 4, which performed well in physical activity classification tasks16, to represent three different domains (time, frequency, and wavelet). PDA is defined by the proportion of SMA over 0.135 g for each sensor in an observation window (Fig. 1b). The remaining features are defined by the following formulae for each 5-second window: where:
x, y, z represent the x-, y- and z-axes vectors, respectively, of the filtered accelerometry time series within the given 5 second window and xn, yn, zn represent the nth elements of these vectors.
N represents the length of each of the x, y, z vectors.
* represents the 1-dimensional convolution operator.
bh represents a 1-dimensional, 4th-order high-pass Butterworth filter with fc = 2.5 Hz.
bl represents a 1-dimensional, 4th-order low-pass Butterworth filter with fc = 2.5 Hz.
X, Y, Z represent the discrete Fourier transforms of the x, y, z vectors respectively where Xn, Yn, Zn represent the nth elements of these Fourier transform vectors and Xf, Yf, Zf represent the coefficients of the Fourier transforms that correspond to linear frequency f.
represent the vector of lth-level detail coefficients of the 5th-order Daubechies wavelet transform of the x, y, z vectors respectively.
Post-capture processing of accelerometry was performed offline using MATLAB (Version 9.8.0, MathWorks, Natick, MA, USA) with the Signal Processing, Wavelet, System Identification, and Symbolic Toolboxes.
Multiple imputation of missing motion features
Due to insufficient battery on the sensors, bedside interventions, interfering equipment, or patient migrations for surgery, imaging, or interunit transfers, a median of 1.56% of each patient’s intended recording duration was missing per sensor in our dataset. Missing motion features were multiply imputed (m = 9) with a multivariate normal time-series algorithm (features were first normalized with the Box-Cox transform45) from the ‘Amelia II’ package (v1.7.6)46 in R (v4.0.0)47. The algorithm exploits both spatial correlation (motion feature correlation across the sensors of the same participant) and temporal correlation (autocorrelation structures within each sensor’s time series) to stochastically impute missing time series values in multiple, independently trained runs. We performed subsequent statistical analyses on all 9 imputations to account for variation across imputations.
This model assumes that the data are missing at random (MAR) (i.e., the pattern of missingness is independent of unobserved data48), which we validated by observing the independence of missingness from sensor placement or time of day (S. Fig. 9). A complete characterization of the missing data of each patient can be found in S. Table 7.
Correction of gross external movements
At time points where the bed-placed sensor SMA exceeded 0.135 g (a proposed threshold between static and dynamic activity27) and preceded a spike in extremity feature values (1.33% of the time), the bed sensor values of SMA, HLF, BPW, and WVL were subtracted from the extremity values and the bed sensor values of MFR and FDE were added to the extremity values. If a corrected value fell outside the feasible range of static activity for the feature, we replaced it with a random value drawn uniformly from the static activity range of that feature (S. Table 8).
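The correction rule above can be sketched for a single feature value as follows. This is a simplified illustration, not the study code: it assumes the time point has already been flagged (bed SMA above 0.135 g preceding a spike in extremity values), and the function name and the static ranges used in the example calls are hypothetical placeholders rather than the values in S. Table 8.

```python
import random

SUBTRACT_BED = {"SMA", "HLF", "BPW", "WVL"}  # bed value is subtracted from the extremity value
ADD_BED = {"MFR", "FDE"}                     # bed value is added to the extremity value


def correct_feature(name, extremity_value, bed_value, static_range, rng):
    """Correct one extremity feature value at a flagged time point."""
    if name in SUBTRACT_BED:
        corrected = extremity_value - bed_value
    elif name in ADD_BED:
        corrected = extremity_value + bed_value
    else:
        raise ValueError(f"no correction rule for feature {name!r}")
    low, high = static_range
    if not (low <= corrected <= high):
        # Outside the feasible static range: draw a replacement uniformly from that range.
        corrected = rng.uniform(low, high)
    return corrected


rng = random.Random(0)
print(correct_feature("SMA", 0.30, 0.20, (0.0, 0.135), rng))  # stays in range
print(correct_feature("SMA", 0.10, 0.20, (0.0, 0.135), rng))  # negative, so resampled
```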
Repeated k-fold cross-validation for unbiased model validation
The study population (n = 69) was partitioned 25 times with repeated k-fold cross-validation (5 repeats, 5 folds) into training sets (~80%, n ≈ 55) and validation sets (~20%, n ≈ 14) for each of the 19 tested observation windows (S. Table 1) for each of the three tested endpoints (Fig. 2b). In splits for motor function detection, patients were stratified by median GCSm over their available observations, while in splits for functional outcome prediction, patients were stratified by GOSE scores. One of the nine missing value imputations was drawn with replacement for each partition.
Repeated cross-validation partitions were performed with the ‘caret’ package (v6.0-86)49 in R.
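For readers who want to see the partitioning logic spelled out, the following is a minimal Python sketch of patient-level repeated stratified k-fold splitting with one imputation drawn per partition. It is a stand-in for, not a reproduction of, the ‘caret’ routine used in this study; the function name, the seed, and the toy binary stratum labels are hypothetical.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_repeats=5, n_folds=5, n_imputations=9, seed=0):
    """Return (repeat, fold, train_ids, val_ids, imputation) tuples.

    labels maps each patient ID to the stratum used for stratification.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for pid, label in labels.items():
        by_stratum[label].append(pid)
    partitions = []
    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        offset = 0
        for members in by_stratum.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal patients round-robin so every fold receives a similar share of each stratum.
            for i, pid in enumerate(shuffled):
                folds[(offset + i) % n_folds].append(pid)
            offset += len(shuffled)
        for fold, val_ids in enumerate(folds):
            held_out = set(val_ids)
            train_ids = [pid for pid in labels if pid not in held_out]
            # One imputation per partition, drawn with replacement across partitions.
            imputation = rng.randint(1, n_imputations)
            partitions.append((repeat, fold, train_ids, sorted(val_ids), imputation))
    return partitions


# Toy cohort of 69 patients, 20 of whom belong to the positive stratum.
labels = {pid: (1 if pid < 20 else 0) for pid in range(69)}
partitions = repeated_stratified_kfold(labels)
print(len(partitions), [len(val) for _, _, _, val, _ in partitions[:5]])
```

Splitting by patient rather than by observation keeps all recordings from one patient on the same side of each partition.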
Motor function detection
We tested 19 unique observation window durations (S. Table 1) of accelerometry-derived features directly preceding GCSm evaluations (Fig. 1b). At each of these evaluation points, motion features were organized into matrices where each column represents a unique combination of motion feature type (8 total), sensor placement (6 total), and, for non-PDA features, time before the evaluation. Columns were normalized based on distributions of each placement-feature type combination (48 in total) in the training set. Normalized matrices underwent supervised dimensionality reduction with linear optimal low-rank projection (LOL)50 learned from the training set. Target dimensionality (d ∈ [2,20]) was tested as a model hyperparameter. Low-dimensional vectors of each d then underwent element-wise Yeo-Johnson transforms51 for scaled normalization (learned from the training set) and were used to train and validate logistic regression (‘glm’) models with binary endpoints at each GCSm threshold. All of these steps were performed in R.
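The Yeo-Johnson step, with scaling parameters learned only from the training set, can be sketched as below. This is a simplified illustration: the transform parameter λ is normally estimated by maximum likelihood on the training data, whereas here it is passed in as a fixed, hypothetical value, and the function names and toy values are assumptions.

```python
import math
import statistics


def yeo_johnson(y, lam):
    """Yeo-Johnson power transform of a single value."""
    if y >= 0:
        return math.log1p(y) if lam == 0 else ((y + 1) ** lam - 1) / lam
    return -math.log1p(-y) if lam == 2 else -((1 - y) ** (2 - lam) - 1) / (2 - lam)


def fit_scaler(train_values, lam):
    """Learn the centring and scaling constants from training data only."""
    transformed = [yeo_johnson(v, lam) for v in train_values]
    return statistics.mean(transformed), statistics.stdev(transformed)


def apply_scaler(values, lam, mean, sd):
    """Apply the training-set transform to any data (training or validation)."""
    return [(yeo_johnson(v, lam) - mean) / sd for v in values]


train = [-1.2, -0.3, 0.0, 0.4, 2.5, 6.0]  # toy values of one low-dimensional component
validation = [-0.5, 1.0]
lam = 0.5  # hypothetical; would be estimated on the training set in practice
mean, sd = fit_scaler(train, lam)
print(apply_scaler(validation, lam, mean, sd))
```

Fitting the constants on the training set alone prevents information from the validation patients leaking into the model.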
Functional outcome prediction
The methodology for functional outcome prediction was identical to that of motor function detection except that GOSE thresholds instead of GCSm thresholds were used as endpoints.
Assessment of model performance and calibration on validation sets
Both motor function detection and functional outcome prediction models were trained and validated on each of the 25 repeated cross-validation splits for each of the 19 observation windows for each of the 19 unique target dimensionalities (d) for each of the endpoint thresholds (5 for GCSm, 5 for GOSE at discharge, 7 for GOSE at 12 months). Models returned binary prediction probabilities as well as a classification based on a probability threshold of 0.5 for each validation set observation.
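The size of the resulting model grid follows directly from the counts above, as the short sketch below shows. The variable names are hypothetical, and treating a probability of exactly 0.5 as a positive classification is an assumption, since the text does not specify how ties are handled.

```python
N_PARTITIONS = 25                # 5 repeats x 5 folds
N_WINDOWS = 19                   # observation window durations
DIMENSIONALITIES = range(2, 21)  # target dimensionalities d = 2, ..., 20
THRESHOLDS = {"GCSm": 5, "GOSE at discharge": 5, "GOSE at 12 months": 7}

# Number of trained models per endpoint and in total.
models_per_endpoint = {
    endpoint: N_PARTITIONS * N_WINDOWS * len(DIMENSIONALITIES) * n_thresholds
    for endpoint, n_thresholds in THRESHOLDS.items()
}
n_models = sum(models_per_endpoint.values())


def classify(probability, cutoff=0.5):
    """Binary classification from a predicted probability (ties counted as positive)."""
    return int(probability >= cutoff)


print(models_per_endpoint, n_models)
```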
Based on the validation set predictions, we calculated metrics of binary outcome discrimination performance (S. Tables 2, 4, and 6), classification performance (Table 2), and probability calibration26, 52 (Table 3). We also visualized ROC curves (Fig. 2a, 3a, and S. Fig. 4a), probability calibration curves (Fig. 4), and, in one case, the precision-recall curve (S. Fig. 3a) of the optimally discriminating (maximal AUC) models to assess discrimination, calibration, and case detection power respectively. We calculated unbiased mean values and 95% confidence intervals for both metrics and curves with bootstrap bias-corrected cross-validation (BBC-CV) with repeats53 on 1,000 resamples of the patient set across the validation set predictions. In this way, 95% confidence intervals account for the variation across the patient set, across the nine missing value imputations, and across the 25 repeated cross-validation partitions.
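The idea of resampling at the patient level can be illustrated with a plain percentile bootstrap of the AUC. This sketch is deliberately simpler than BBC-CV, which additionally corrects for the optimism of selecting the best configuration; it only shows how patient-level resampling yields a confidence interval. Function names, the seed, and the toy predictions are hypothetical.

```python
import random


def auc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(predictions, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile CI; predictions maps patient ID -> list of (label, probability)."""
    rng = random.Random(seed)
    patients = list(predictions)
    estimates = []
    while len(estimates) < n_resamples:
        # Resample whole patients so repeated observations stay together.
        sample = [rng.choice(patients) for _ in patients]
        labels = [l for pid in sample for l, _ in predictions[pid]]
        scores = [s for pid in sample for _, s in predictions[pid]]
        if 0 < sum(labels) < len(labels):  # AUC needs both classes present
            estimates.append(auc(labels, scores))
    estimates.sort()
    lower = estimates[round(alpha / 2 * n_resamples)]
    upper = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy validation predictions: one observation per patient, overlapping score ranges.
predictions = {
    pid: [(pid % 2, 0.4 + 0.1 * (pid % 2) + 0.1 * (pid % 3))] for pid in range(20)
}
print(bootstrap_auc_ci(predictions))
```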
Feature significance scores
The coefficients (i.e., loadings) of the trained LOL projection matrix represent the relative importance of each column in explaining the variance in the dataset stratified by the endpoint50. Thus, we derived a relative importance score of each sensor-feature type combination for both motor function detection and functional outcome prediction by multiplying the mean absolute value of the loadings for each combination and the absolute value of the trained logistic regression coefficient of the corresponding reduced dimension. This was performed across all 25 partitions of each combination of observation window, threshold, and endpoint. We then calculated 95% confidence intervals on feature significance scores by bootstrapping 1,000 resamples across the 25 repeated cross-validation folds and nine missing value imputations.
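The score described above can be sketched for one trained model as follows. Two points are assumptions rather than statements of the study's method: the products are summed over the reduced dimensions to give one score per combination, and the function name, sensor labels, and numeric values are hypothetical.

```python
def importance_scores(loadings, coefficients, column_groups):
    """Relative importance of each sensor-feature combination for one trained model.

    loadings: one row per input column, each row holding d projection loadings.
    coefficients: the d logistic regression coefficients (intercept excluded).
    column_groups: maps (sensor, feature) to the indices of its input columns.
    """
    scores = {}
    for group, columns in column_groups.items():
        total = 0.0
        for j, beta in enumerate(coefficients):
            mean_abs_loading = sum(abs(loadings[c][j]) for c in columns) / len(columns)
            total += mean_abs_loading * abs(beta)  # assumed aggregation: sum over dimensions
        scores[group] = total
    return scores


# Toy model with three input columns projected onto d = 2 dimensions.
loadings = [[0.5, -0.1], [0.3, 0.1], [-0.2, 0.4]]
coefficients = [2.0, -1.0]
column_groups = {("sensor A", "SMA"): [0, 1], ("sensor B", "SMA"): [2]}
scores = importance_scores(loadings, coefficients, column_groups)
print(scores)
```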
Data Availability
Per our current Johns Hopkins Medicine IRB protocol (IRB00135674), we are not permitted to share the clinical data collected for this study. However, we welcome all forms of collaboration, and urge interested investigators to contact the corresponding author (SB: sb2406{at}cam.ac.uk) with their institutional affiliation and proposed use of the dataset to submit a new protocol for access. The data may not be used for commercial products or redistributed in any way.
CODE AVAILABILITY
All code used in the data collection and analyses outlined in this manuscript can be found at the following GitHub repository: https://github.com/sbhattacharyay/nims (DOI: 10.5281/zenodo.4765305).
AUTHOR CONTRIBUTIONS
S.B. aided in the conceptualization of the study, developed the methodology of the experiments, acquired accelerometry data from patients, acquired funding for the project, performed statistical analyses on the data, visualized the results for publication, and wrote the complete manuscript. J.R. and R.E.C. aided in the conceptualization and data collection of this work and revised the manuscript. M.W., H.B.K., and E.J. aided S.B. in the statistical analysis, processing of data, and visualization of results. P.D. extracted neurological assessment scores from electronic health records. E.C. recruited patients for the study, performed clinical surveys, and collected clinical data from patient records. P.K. aided in the conceptualization of the study and the development of the methodology and established the data acquisition infrastructure. R.D.S. served as the principal investigator, conceptualized the study, aided in the development of the methodology, procured IRB approval for data collection from human subjects, aided in data collection, provided access to clinical resources at the Johns Hopkins Hospital, and revised the manuscript.
COMPETING INTERESTS STATEMENT
The authors declare that they have no conflicts of interest.
ACKNOWLEDGEMENTS
We gratefully acknowledge the patients, families, NCCU nurses, and physicians who participated in and contributed to this study. S.B. would like to thank Kathleen Mitchell-Fox (Univ. of Cambridge) for reviewing and offering comments on the manuscript. We also wish to specifically thank Aditya Joshi (Rowan Univ.), Sanya Yadav (Univ. of Pittsburgh), Tobias Fauser (Univ. of Arizona), Michiru Fredricks (Johns Hopkins Univ.), Alexander Sigmon (Johns Hopkins Univ.), Shikha Gandhi (Johns Hopkins Univ.), and Joshua Vogelstein (Johns Hopkins Univ.) for their roles in the early development, data curation, and advising of statistical methodologies of the NIMS project.
This work was partially supported by awards from the Johns Hopkins University Office of the Provost and the Hodson Trust, received by S.B. S.B. is currently funded by a Gates Cambridge fellowship.