Abstract
Background Depressive symptoms are rising in the general population and can lead to depression years later, but the contributing factors are less known. Although the link between sleep disturbances and depressive symptoms has been reported, the predictive role of sleep on depressive symptoms severity (DSS) and the impact of anxiety and brain structure on their interrelationship at the individual subject level remain poorly understood.
Methods Here, we used 1813 participants from three population-based datasets. We applied ensemble machine learning models to assess the predictive role of sleep, anxiety, and brain structure on DSS in the primary dataset (n = 1101), then we tested the generalizability of our findings in two independent datasets. In addition, we performed a mediation analysis to identify the effect of anxiety and brain structure on the link between sleep and DSS.
Results We observed that sleep quality could predict DSS (r = 0.43, rMSE = 2.73, R2 = 0.18), and adding anxiety strengthened its prediction (r = 0.67, rMSE = 2.25, R2 = 0.45). However, brain structure (alone or along with sleep/anxiety) did not predict DSS. Importantly, out-of-cohort validations of our findings in other samples provided similar findings. Further, anxiety scores (not brain structure) could mediate the link between sleep quality and DSS.
Conclusion Taken together, poor sleep quality and anxiety symptoms could predict DSS across three cohorts. We hope that our findings incentivize clinicians to consider the importance of screening and treating subjects with sleep and anxiety problems to reduce the burden of depressive symptoms.
Introduction
In modern societies, about 25% of the general population presents depressive symptoms such as sadness, irritability, anhedonia, low motivation, distracted concentration, worthlessness, abnormal appetite, and sleep disturbance [1, 2]. Depressive symptoms have dramatically increased in general populations from 1991 to 2018, mainly in young women [3]. Recent findings during the COVID-19 pandemic observed that the prevalence of depressive symptoms increased about 3-fold compared to the earlier population-based estimates of the mental health [4]. Critically, depressive symptoms could predict major depressive disorder (MDD) around 15 years later in white adults [5]. Hence, screening subjects with depressive symptoms in the general population is essential for decreasing the rate, burden, and severity of depression [6]. In addition, a high conversion rate of depressive symptoms to MDD [5] and the noticeable health-related and economic burden of depressive problems in the general population [7] make it imperative to identify the associated behavioral and brain factors of depressive phenotype.
Everyday experience shows that mood is impaired after one or more nights of disturbed sleep, suggesting a robust link between poor sleep and depressive symptoms [8–10]. In particular, several meta-analyses suggested that sleep disturbance, and particularly insomnia, is a critical risk factor for developing depression [11–14]. On the other hand, insomnia and hypersomnia are among the diagnostic criteria of MDD, suggesting a bidirectional association. Treatment of sleep problems reduces depressive symptoms and MDD [15–17], suggesting that targeting sleep quality is necessary for the management of depressive problems. The open questions are whether depressive symptoms can be predicted from sleep quality scores at the individual subject level and which behavioral and brain factors underlie their association.
Anxiety is the most prominent mental condition that co-occurs with both sleep disturbance and depression [10, 18]. Moreover, a growing body of neuroimaging evidence has highlighted the role of structural and functional brain alterations, mainly in the default mode and salience networks, in the interplay between sleep and depressive symptoms [19]. Using the Human Connectome Project in young adults (HCP-Young) cohort, Cheng and colleagues [20] demonstrated that increased functional connectivity between several brain regions mediates the association between depressive symptom severity (DSS) and sleep quality. Existing behavioral and neuroimaging studies on the link between sleep and depressive symptoms have used conventional statistical methods (mainly correlations) in a single cohort [19], which tend to generalize poorly to other samples. Thus, the “real world” challenge is the prediction of symptoms (i.e., DSS) in unseen data or independent samples to achieve generalization to future cases, which cannot be addressed by traditional statistical approaches based on a single sample. Advanced machine learning (ML) predictive models offer a way to identify the role of contributing neurobehavioral factors in predicting depressive problems across various general population samples, which is crucial for precision psychiatry and ultimately guiding clinical practice [21, 22]. Thus, aiming to address the reproducibility gap in the literature, we applied ML approaches in the HCP-Young dataset to build a predictive model for DSS based on sleep quality, anxiety, and gray matter volume (GMV). Based on the ML models trained in the HCP-Young dataset, out-of-cohort validation of our ML algorithm was conducted on two independent large-scale datasets (the lifespan Human Connectome Project (HCP-Aging) and the enhanced Nathan Kline Institute-Rockland sample (eNKI)) to understand the generalizability of our models across different cohorts.
In addition, we aimed to understand the mediatory role of anxiety and GMV in the association between sleep quality and DSS in the HCP-Young sample.
Methods and materials
Participants
In this study, the HCP-Young dataset was the primary dataset, acquired by the Washington University-University of Minnesota (WU-Minn HCP) consortium [23]. Out of the 1206 participants, we selected 1101 participants who had structural MRI images and the phenotypic data of interest in this study, i.e., sleep quality, anxiety, and depressive symptoms. For out-of-cohort validation, we used 334 participants from the eNKI dataset (http://fcon_1000.projects.nitrc.org/indi/enhanced/) [24] and 378 participants from the HCP-Aging dataset (https://www.humanconnectome.org/) [25]. Notably, participants of all datasets were from general populations.
Behavioral measures
Sleep quality
Sleep quality assessment was based on the self-reported Pittsburgh Sleep Quality Index (PSQI) questionnaire [27], which has 19 questions assessing sleep quality over a one-month period. The PSQI comprises seven components, namely subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleep medicine, and daytime dysfunction. The total PSQI score is the sum of these components. Of note, a higher total PSQI score (> 5) reflects poor sleep quality.
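The scoring rule above can be illustrated with a minimal sketch. The helper names and dictionary keys are hypothetical, and the 0 to 3 range per component is assumed from standard PSQI scoring rather than stated in this paper.

```python
# Hypothetical key names for the seven PSQI components listed above.
PSQI_COMPONENTS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbances",
    "use_of_sleep_medicine",
    "daytime_dysfunction",
)


def psqi_total(components):
    """Sum the seven component scores (each assumed to lie between 0 and 3)."""
    missing = [name for name in PSQI_COMPONENTS if name not in components]
    if missing:
        raise ValueError(f"missing PSQI components: {missing}")
    for name in PSQI_COMPONENTS:
        if not 0 <= components[name] <= 3:
            raise ValueError(f"{name} must be between 0 and 3")
    return sum(components[name] for name in PSQI_COMPONENTS)


def is_poor_sleeper(total, cutoff=5):
    """A total score above the cutoff (> 5) is read as poor sleep quality."""
    return total > cutoff


# Made-up participant: every component scored 1, except sleep latency scored 2.
example = dict.fromkeys(PSQI_COMPONENTS, 1)
example["sleep_latency"] = 2
example_total = psqi_total(example)
print(example_total, is_poor_sleeper(example_total))
```

Note that the prediction models in this study used the 19 individual PSQI questions as features; the component-based total shown here corresponds to the mediation analysis and the confirmatory analysis based on seven components.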
DSS
Depressive symptoms were measured based on the DSM-IV-oriented depressive problems portion of the Achenbach Adult Self-Report for ages 18-59 [26]. This questionnaire has 123 items in total, and the total depressive score, obtained from 14 depression-related items, ranges from 0 to 28 points. A higher score indicates more severe depressive symptoms. Notably, there are two sleep-related items in this questionnaire, which were removed in our primary ML analyses and mediation analysis. These questions were “I sleep more than most other people during the day and/or night” and “I have trouble sleeping”. We calculated the total score of depressive problems after removing the sleep-related items and used this total score in our analyses. Further, as a confirmatory analysis, we examined the original DSS including these two sleep-related items.
Anxiety
Anxiety score was measured using six relevant items of DSM-IV-oriented Achenbach Adult self-report for age 18-59. None of these items are related to sleep or depressive problems. Similar to DSS, the total score of anxiety has been used in our study and a higher anxiety score shows more anxiety problems.
Calculation of gray matter parcel volume
T1 structural MRI images were acquired with a Siemens 3T Skyra scanner and preprocessed using the WU-Minn HCP consortium pipelines [27]. We performed voxel-based morphometry (VBM) using the Computational Anatomy Toolbox (CAT12) [28], implemented in Statistical Parametric Mapping (SPM12, https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). During this process, we corrected bias-field distortions, and after noise removal and skull stripping, the images were normalized to the MNI-152 standard space. Then, we segmented the brain tissue into gray matter, white matter, and cerebrospinal fluid. Subsequently, we modulated the gray matter segments for the non-linear transformations performed during normalization to obtain the actual volumes. GMVs of the cortical, subcortical, and cerebellar areas were assessed using functionally-informed in-vivo atlases (400 cortical parcels from the Schaefer atlas [29], 36 subcortical parcels from Brainnetome [30], and 37 cerebellar parcels from Buckner [31]), resulting in 473 brain parcels, as applied previously [32].
Statistical analyses
Prediction analysis in the HCP-Young dataset
Ensemble decision tree models were employed to predict DSS based on sleep quality using MATLAB R2020a software [33]. The ensemble methods were LS-Boost and bagging, and the choice between them was treated as a hyperparameter to be selected by the algorithm. Further, we performed nested 10-fold cross-validation considering the family structure of subjects, in which twins and siblings were not separated across the training, validation, and test sets to avoid potential leakage. Age, sex, and total GMV were regressed out from the features, using the parameters of the regression model estimated in the training sets for the test sets. Then, features of the training sets were ranked and sorted (from maximum to minimum importance) by the relief method to enable the algorithm to select features based on the maximum rank [34]. After putting aside the validation sets, models were trained in each remaining training set ten times with ten different feature numbers, so that the number of features was also selected automatically based on the minimum prediction error in the validation sets. In this stage, hyperparameters were optimized using the Bayesian method [35] with 100 iterations. Finally, models with minimum prediction error in the validation sets were selected, fitted on the entire training sets (training + validation), and used to predict the unseen test sets. Thus, in the end, we had ten models (one model for each test set), and our ML pipeline could select a different algorithm (LS-Boost/bagging), along with its hyperparameters and a different feature number, for each fold. These predictive models had 19 input features consisting of the PSQI questions. Subsequently, we added anxiety (total score) and whole-brain GMV (n = 473) features to measure the role of anxiety and GMV in DSS prediction. Of note, we did not perform feature selection for models with only sleep quality features.
Moreover, in several confirmatory analyses, we removed participants with a history of diagnosed depression, we included sleep-related items of DSS, and we used seven components of PSQI (instead of individual items) as described in the supplement.
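The confound-removal step described above (regression parameters estimated in the training set only and then applied to the test set) can be sketched as follows. This is an illustrative sketch in plain Python, not the MATLAB code used in the study; the function names, the ordinary least-squares solver, and all numbers are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("confound matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_confound_model(confounds, feature):
    """Least-squares fit of feature ~ intercept + confounds (training data only)."""
    rows = [[1.0] + list(c) for c in confounds]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, feature)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def residualize(confounds, feature, beta):
    """Remove the confound-predicted part of a feature using fixed coefficients."""
    residuals = []
    for c, y in zip(confounds, feature):
        predicted = beta[0] + sum(b * v for b, v in zip(beta[1:], c))
        residuals.append(y - predicted)
    return residuals


# Made-up confounds: (age, sex, total GMV in arbitrary units).
train_confounds = [(22, 0, 6.5), (25, 1, 7.0), (30, 0, 7.2), (33, 1, 6.8), (28, 1, 7.5)]
train_feature = [2.0 + 0.1 * a + 1.0 * s + 0.5 * g for a, s, g in train_confounds]
test_confounds = [(27, 0, 7.1)]
test_feature = [2.0 + 0.1 * 27 + 0.5 * 7.1 + 0.5]  # 0.5 above the confound trend

beta = fit_confound_model(train_confounds, train_feature)      # training set only
test_residuals = residualize(test_confounds, test_feature, beta)  # reused on test set
print(test_residuals)
```

Because the coefficients are never re-estimated on held-out participants, no information from the test set enters the correction; the same idea carries over to the out-of-cohort datasets described below.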
Out-of-cohort validation in two independent datasets
We used two independent large-scale datasets (the eNKI and HCP-Aging) to test whether the results of ML models using the HCP-Young dataset are robust and generalizable to other datasets. After training ML models on the HCP-Young dataset, we had ten models, which we used to predict the individual DSS in the other datasets, and we averaged the results of all ten models for each person. Of note, we neither tuned our models nor performed cross-validation for the two independent datasets. Put differently, we used these independent datasets solely for prediction and used the regression model of the primary dataset (HCP-Young) for regressing out age, sex, and total GMV in these datasets as well. All the phenotypic data (sleep quality, anxiety, and DSS) were obtained from the same questionnaires across the three datasets. Details of this analysis are provided in the supplement.
Mediation analysis
Structural equation modeling (SEM) using Amos 24.0 software [36] was applied to statistically model the underlying mechanisms of the link between total sleep quality and DSS scores. In this analysis, a latent variable was constructed from brain GMV and used in the models. Mediation analysis investigates how much of the covariance between two variables can be explained by the mediator variable(s). Age, sex, and total GMV were controlled in mediation analyses. More details of the mediation analysis are provided in the supplement.
Results
Demographics
The primary dataset of this investigation (HCP-Young) included 1101 participants (22–35 years, mean age = 28.79 ± 3.69, 54.3% female), 103 of whom (9%) had a history of DSM-IV-based depression episodes during their lifetime. The detailed demographic characteristics of participants are provided in Table 1. We had two other datasets for out-of-cohort validation analysis, i.e., the HCP-Aging and eNKI. We included 378 participants (36–59 years, mean age = 47.3 ± 7, 57.9% female) from the HCP-Aging dataset and 334 participants (18–59 years, mean age = 37 ± 13.8, 62% female) from the eNKI dataset.
Sleep and anxiety predicted DSS in the HCP-Young dataset
The details of the ML pipeline for training and evaluation of models in the HCP-Young dataset are presented in Fig. 1. ML models based on sleep quality could predict DSS (unseen data during model training, r = 0.43, rMSE = 2.73, R2 = 0.18, CI = 3.33 – 3.50) (Fig. 2A). Adding the anxiety score to sleep quality features improved the prediction substantially (r = 0.67, rMSE = 2.25, R2 = 0.45, CI = 3.33 – 3.57) (Fig. 2B). In contrast, adding GMV features to sleep quality (r = 0.41, rMSE = 2.76, R2 = 0.16, CI = 3.35 – 3.52) and to the combination of sleep quality and anxiety (r = 0.66, rMSE = 2.26, R2 = 0.44, CI = 3.37 – 3.40) did not improve the prediction (Fig. 2C,D). Furthermore, GMV alone could not predict DSS (r = 0, rMSE = 3.09, R2 = 0.05), but anxiety could predict DSS (r = 0.62, rMSE = 2.37, R2 = 0.38). The combination of GMV and anxiety (r = 0.61, rMSE = 2.38, R2 = 0.38) predicted DSS (eFigure 7 in the supplement). Based on the designed method, the ML algorithm automatically selected different feature numbers in each fold, but the ensemble method selected for all folds of all models was LS-Boost. To assess the robustness of our findings, we performed several confirmatory analyses. ML analysis after removing 103 participants with a history of depression showed similar results, e.g., a combination of sleep quality and anxiety predicted DSS similarly (r = 0.61, rMSE = 2.18, R2 = 0.37) (eFigure 3 in the supplement). Also, the results were similar for the model based on a combination of sleep quality and anxiety (r = 0.71, rMSE = 2.42, R2 = 0.50) using the original DSS scores (not excluding the two sleep-related questions from the DSS questionnaire) (eFigure 6 in the supplement).
Moreover, repeating the analyses based on seven components of PSQI (instead of 19 questions of the self-reported Pittsburgh sleep quality index (PSQI)) also revealed robust results in predicting DSS (r = 0.64, rMSE = 2.32, R2 = 0.41, based on a combination of sleep quality and anxiety) (eFigure 4 in the supplement). The feature importance in the ML model demonstrated that sleep-related daytime dysfunction, sleep disturbance, and subjective sleep quality were more important than other sleep components in predicting DSS (eFigure 5 in the supplement).
ML pipeline for prediction of DSS (depressive symptoms severity) considering family structure. First, 10-fold cross-validation was performed in a way that siblings were not separated across training/test sets. After putting aside the test set (of the first fold), we performed a 10-fold cross-validation again on the training set (of the first fold), considering family structure. In this stage, we split off validation sets and trained models on the remaining training sets. On each fold, we trained models and optimized hyperparameters ten times with ten different feature numbers. Hence, we had ten folds and ten models for each fold, from which the algorithm selected the model with the best performance and minimum error across all folds. Subsequently, the selected model was fitted on the entire training set and then evaluated on the test set. This process was repeated for all other nine folds (Note: all units in the figure are arbitrary).
Prediction of DSS in the HCP-Young dataset. A) prediction based on sleep quality; B) prediction based on a combination of sleep quality and anxiety; C) prediction based on a combination of sleep quality and GMV; D) prediction based on a combination of sleep quality, anxiety, and GMV. (GMV: gray matter volume, DSS: depressive symptoms severity, r: correlation coefficient between real and predicted DSS, rMSE: root mean squared error, R2: determination coefficient)
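For readers less familiar with the three performance measures reported throughout (r, rMSE, and R2), a minimal sketch of how they can be computed from real and predicted scores is given here. The function names and toy values are hypothetical, and the definition of R2 as one minus the ratio of residual to total sum of squares is an assumption, since the exact formula used in the study is not stated.

```python
import math


def pearson_r(y_true, y_pred):
    """Correlation coefficient between real and predicted scores."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the predicted score."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Assumed definition: 1 - SS_residual / SS_total."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy scores, not study data.
toy_true = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.5, 2.0, 2.5, 4.0]
print(pearson_r(toy_true, toy_pred), rmse(toy_true, toy_pred), r_squared(toy_true, toy_pred))
```

Under this definition, R2 can be negative when predictions are worse than simply predicting the mean, whereas a squared correlation cannot.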
Sleep and anxiety predicted DSS in the HCP-Aging and eNKI datasets
Interestingly, we were able to predict DSS in both HCP-Aging and eNKI cohorts using models which were trained by the HCP-Young dataset (Fig. 3A&B). In the HCP-Aging dataset, sleep quality features could predict DSS robustly (r = 0.57, rMSE = 2.64, R2 = 0.27, CI = 3.27 – 3.54). Further, adding anxiety score to sleep quality features could improve the prediction in this dataset (r = 0.72, rMSE = 2.19, R2 = 0.50, CI = 2.97 – 3.33). Adding GMV features to the sleep quality (r = 0.56, rMSE = 2.65, R2 = 0.27, CI = 3.23 – 3.48) and a combination of sleep quality and anxiety score (r = 0.72, rMSE = 2.21, R2 = 0.49, CI = 3.04 – 3.40) provided similar results to the primary (HCP-Young) dataset.
Out-of-cohort validation of ML results in two independent datasets. A) prediction of DSS in HCP-Aging dataset based on sleep quality, a combination of sleep quality and anxiety, a combination of sleep quality and GMV, a combination of sleep quality and anxiety, and GMV; B) prediction of DSS in eNKI dataset based on sleep quality, a combination of sleep quality and anxiety, a combination of sleep quality and GMV, a combination of sleep quality and anxiety, and GMV (GMV: gray matter volume, DSS: depressive symptoms severity, r: correlation coefficient between real and predicted DSS, rMSE: root mean squared error, R2: determination coefficient)
Similarly, in the eNKI dataset, sleep quality predicted DSS (r = 0.50, rMSE = 2.70, R2 = 0.16, CI = 3.54 – 3.85), and a combination of sleep quality and anxiety score predicted DSS (r = 0.66, rMSE = 2.34, R2 = 0.38, CI = 3.31 – 3.73). Adding GMV features to the sleep quality could not improve the prediction (r = 0.51, rMSE = 2.68, R2 = 0.18, CI = 3.49 – 3.78) and a combination of sleep quality, anxiety, and GMV (r = 0.68, rMSE = 2.29, R2 = 0.40, CI = 3.34 – 3.74) revealed the same result as the HCP-Young dataset.
Mediatory role of anxiety and GMV on the link between sleep quality and DSS in the HCP-Young
We observed a significant mediatory role of anxiety in the relationship between sleep quality and DSS (eFigure 1 in the supplement). Anxiety score partially mediated 52.6% of the total effect size (p < 0.001). In this mediation analysis, GMV could not mediate the link between sleep quality and DSS.
Discussion
The main findings of this study pointed out that sleep quality could predict DSS in three independent datasets and adding anxiety (but not GMV) to the sleep quality enhanced such prediction. Our confirmatory ML analyses to consider sleep-related questions of DSS, excluding participants with a depression history, and repeating the analyses based on seven components of PSQI revealed similar results, indicating the robustness of our prediction results. Remarkably, the prediction of DSS based on sleep and anxiety features was supported in two other large-scale datasets, suggesting the generalizability of our ML models. Moreover, we found that anxiety (but not GMV) mediated the link between sleep quality and DSS. To our knowledge, this is the first study that assessed the prediction of DSS based on sleep quality, anxiety, and GMV using the ML models in three independent general population samples.
Our findings are consistent with a body of literature, including previous meta-analyses, showing that sleep disturbance is associated with depressive problems [11–14]. In large-scale population cohorts, it has also been shown that sleep quality is associated with depressive symptoms [9, 20]. However, our study aimed to predict DSS based on sleep quality in different samples instead of investigating only the correlation between them. Animal models revealed that neonatal sleep disturbance could lead to adulthood depressive symptoms [37, 38]. Longitudinal human studies showed that people with sleep initiation problems might experience depression over the next 3-6 years of their life [39, 40]. Interestingly, toddlers’ sleep problems at the age of 18 months predicted depressive symptoms at the age of 8 years [41]. Although these studies did not use ML models to predict individual DSS in another sample, they suggest that poor sleep could be a critical predictor of DSS. A recent ML study [42] demonstrated that sleep disorder is one of the most important features for predicting depression, particularly in individuals with hypertension. That study predicted a binary definition (existence/nonexistence) of depression among adults with hypertension, while our study predicted a wide (0-28) continuous range of depressive symptoms in three databases. Another large-scale ML-based study found that sleep duration is one of the top five predictors of DSS among home-based older adults [43]. Our findings support this hypothesis, although they do not show any causality between sleep and DSS. The cross-sectional nature of our study precludes the assessment of long-term causal pathways in the general population. Thus, the longitudinal role of poor sleep (using both subjective and objective sleep assessments) in developing clinical MDD has to be examined in the future.
In the present study, adding anxiety to sleep quality features improved the prediction of DSS (r = 0.67), and anxiety also had an indirect effect (52.6%) in mediating the link between sleep quality and DSS. The strong interplay between sleep disturbances, anxiety, and depression has been well documented earlier [18, 44], and our study supports such findings. For example, short and long sleep duration are predictors of depression and anxiety in a large cohort [45]. The additive role of anxiety to sleep in DSS prediction is further supported by the notion that sleep loss increases preemptive responding in the amygdala and anterior insula during affective anticipation [46]. Sleep loss is linked to abnormal activity in the medial prefrontal cortex, amygdala, insula, and anterior cingulate cortex, which was associated with higher levels of next-day anxiety [47]. An earlier study using the HCP-Young sample indicated that functional connectivity between the lateral orbitofrontal cortex, dorsolateral prefrontal cortex, anterior and posterior cingulate cortices, insula, parahippocampal gyrus, hippocampus, amygdala, temporal cortex, and precuneus mediated the effect of sleep quality on DSS [20]. Structural brain alterations in the postcentral gyrus and superior temporal gyrus mediated the link between sleep disturbance and depressive symptoms in a small group of shift-working nurses [48]. Another study observed that the GMV of the right insula mediates the relationship between sleep quality and anxiety/depressive symptoms among college students [49]. However, in the present study, GMV could not predict DSS in any dataset and had no significant mediatory effect on the link between sleep and DSS. One explanation could be that the link between sleep disturbance and depressive symptoms is anchored at the functional level rather than in GMV [19]. However, the neurobiological mechanisms underpinning sleep disorders and depression are still under debate and need further elaboration.
Previous large-scale neuroimaging meta-analyses failed to identify a consistent regional abnormality in insomnia disorder and depression [50, 51]. Similarly, ML classification models failed to separate healthy individuals from subjects with insomnia based on brain volumes [52] or to differentiate healthy individuals from patients with depression based on brain structure and function values [53], indicating that neuroimaging derivatives are not optimal features to separate patients with insomnia or depression from healthy subjects.
In this cross-sectional study, we observed that sleep quality and anxiety predict DSS at the individual subject level, but GMV did not contribute to DSS prediction in the HCP-Young sample. Similar patterns were evident in two other samples. Furthermore, anxiety scores (not brain structure) mediated the association between sleep quality and DSS. We hope that our findings, based on three cohorts, incentivize clinicians to consider the importance of screening and treating subjects with sleep and anxiety problems as potential therapeutic targets to reduce the burden of depressive symptoms in our societies.
Author Contributions
Conception and study design: M.O., S.B.E. and M.T. Preprocessing and data analysis: M.O., F.S., S.F., S.M., K.P. Interpretation: M.O., S.G., S.B.E., and M.T. Paper writing and editing: all authors.
Conflict of Interest Disclosures
The authors declare no conflicts of interest.
Funding/Support
Drs. Eickhoff and Patil received Helmholtz Imaging Platform grant (NimRLS, ZT-I-PF-4-010).
Role of the Funders/Sponsor
The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and the decision to submit the manuscript for publication.
Data Sharing Statement
We used the public datasets from the Human Connectome Project (https://www.humanconnectome.org/) and the enhanced Nathan Kline Institute-Rockland sample (eNKI) (http://fcon_1000.projects.nitrc.org/indi/enhanced/).
Online supplement
eMethod
eFigure 1. Mediation analysis of GMV and anxiety on the link between sleep quality and DSS.
eFigure 2. Correlation between sleep quality features.
eTable 1. Sleep quality assessment using PSQI.
eFigure 3. Prediction of DSS after excluding participants who experienced depression.
eFigure 4. Prediction of DSS based on seven components of PSQI.
eFigure 5. Feature importance of 7 PSQI components.
eFigure 6. Prediction of DSS after including sleep-related items of DSS.
eFigure 7. Prediction of DSS based on GMV, anxiety, and combination of GMV and anxiety.
eReferences
Participants and data preprocessing
This investigation had 1101 subjects from the HCP-Young dataset for main analyses and 334 subjects from the eNKI, and 378 subjects from the HCP-Aging datasets for out-of-cohort validation of trained ML models, obtained from the HCP-Young. All participants who had phenotypic and neuroimaging data were included in this study, and there were no inclusion/exclusion criteria based on a diagnosis of sleep disturbances, anxiety, or depression. The phenotypical data (sleep quality, anxiety, DSS) were measured by the same questionnaires in all three datasets. In addition, the HCP-Young dataset has provided information about the history of diagnosed depression with the question “Has the participant experienced a diagnosed DSM-IV major depressive episode over his/her lifetime?”. We excluded participants who had at least one episode of depression in confirmatory analyses (eFigure 3) to test the robustness of models for participants who never had a history of depression.
In addition, we used structural MRI images of the HCP-Young dataset, which were acquired by Siemens 3T Skyra scanner and preprocessed using the HCP pipelines [1]. These T1 weighted MRI images were collected with 0.7mm voxel size isotropic resolution, time of repetition (TR) = 2400 ms, time of echo (TE) = 2.14 ms, time of inversion (TI) = 1000 ms, with flip angle of 8 degrees, and the field of view was 224×224 mm [2].
Mediation findings
Standard mediation analyses were performed using Amos v.24 software, which is a powerful tool for path analysis and SEM. In all mediation analyses, age, sex, and total GMV were controlled. The significance of the models was tested by bias-corrected bootstrapping with 5000 random resamples.
The main aim of the mediation analyses was to assess the underlying mechanisms of the association between sleep quality and DSS. Hence, we made a latent variable of the brain GMVs that had a significant correlation with sleep quality, and used it in the model as GMV. Of note, there was no GMV parcel correlated with DSS or anxiety. Since anxiety scores were not correlated with GMV, the mediational role of anxiety scores was assessed in parallel with the latent variable of GMV (eFigure 1).
Mediation analyses of GMV and anxiety. The standardized total effect of sleep quality on DSS is 0.38, and the direct effect is 0.18. The standardized indirect effect of sleep quality on DSS is 0.20. The direct effect of sleep quality on anxiety is 0.32, and on GMV is −0.17. In addition, the direct effect of anxiety on DSS is 0.60, and the direct effect of GMV on DSS is −0.44, but it is not significant. Hence, GMV has no mediatory role in the link between sleep quality and DSS, but anxiety has a partial mediatory role (52.6% of total effect size) in the link between sleep quality and DSS.
Comparing the results of mediational analyses, we found anxiety problems score as a strong mediator of the link between sleep quality and DSS, while GMV has no mediatory role in the link between sleep quality and DSS.
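The reported share of the total effect carried by the mediator follows directly from the standardized coefficients given above, as the short check below illustrates. The function name is hypothetical, and the path-product line assumes the usual single-mediator identity (indirect effect equals the product of the two paths), which only approximates the reported value because the published coefficients are rounded.

```python
def mediation_summary(total_effect, direct_effect):
    """Return the indirect effect and its share of the total effect."""
    indirect_effect = total_effect - direct_effect
    return indirect_effect, indirect_effect / total_effect


# Standardized effects of sleep quality on DSS reported in the mediation analysis.
indirect, proportion = mediation_summary(total_effect=0.38, direct_effect=0.18)
print(f"indirect = {indirect:.2f}, proportion mediated = {100 * proportion:.1f}%")

# Product of the two anxiety paths (sleep -> anxiety, anxiety -> DSS).
anxiety_path_product = 0.32 * 0.60
print(f"path product = {anxiety_path_product:.3f}")
```

The difference between the total and direct effects gives 0.20, and 0.20 / 0.38 corresponds to the 52.6% reported for anxiety.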
Machine learning-based prediction
ML models
The ensemble decision tree model was used in this investigation, which is one of the most interpretable and powerful ML techniques available. Hyperparameters of these models were the ensemble aggregation method (bagging/LS-Boost), number of ensemble learning cycles [10,50], learning rate (0,1], and minimum number of leaf node observations. Interestingly, in all models, LS-Boost was selected as the best method in the optimization process. The ML pipeline in this investigation consisted of three sequential steps (cross-validation, feature selection, and model training and evaluation):
Cross-validation
First of all, a standard nested 10-fold cross-validation was performed to assess the generalizability of models. The following protocol was applied for nested 10-fold cross-validation: HCP subjects were divided into 10 non-overlapping folds. Each fold was used as a held-back test set, whilst all other folds collectively constructed the training set. Another 10-fold cross-validation was applied to each training set, and made 10 validation sets in each training set. Further, to consider the family structure of HCP subjects in ML analyses, we paid special attention that siblings were not separated in train/validation/test sets.
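The family-aware nested splitting described above can be sketched as follows. This is an illustration in plain Python rather than the code used in the study; the function names, the round-robin assignment of shuffled families to folds, and the toy family identifiers are assumptions made for the example.

```python
import random
from collections import defaultdict


def family_kfold(family_ids, n_folds=10, seed=0):
    """Assign participants to folds so that no family is split across folds."""
    families = sorted(set(family_ids))
    random.Random(seed).shuffle(families)
    fold_of_family = {fam: i % n_folds for i, fam in enumerate(families)}
    folds = defaultdict(list)
    for index, fam in enumerate(family_ids):
        folds[fold_of_family[fam]].append(index)
    return [folds[k] for k in range(n_folds)]


def nested_splits(family_ids, n_folds=10, seed=0):
    """Yield (fit, validation, test) index lists for nested cross-validation."""
    outer = family_kfold(family_ids, n_folds, seed)
    for k, test_idx in enumerate(outer):
        train_idx = [i for j, fold in enumerate(outer) if j != k for i in fold]
        train_families = [family_ids[i] for i in train_idx]
        # Inner folds are built on the training participants only.
        inner = family_kfold(train_families, n_folds, seed + k + 1)
        for val_local in inner:
            val_set = set(val_local)
            val_idx = [train_idx[i] for i in val_local]
            fit_idx = [train_idx[i] for i in range(len(train_idx)) if i not in val_set]
            yield fit_idx, val_idx, test_idx


# Toy cohort: 30 hypothetical families with two siblings each.
toy_family_ids = [f"fam{i // 2}" for i in range(60)]
toy_splits = list(nested_splits(toy_family_ids))
print(len(toy_splits))
```

Splitting at the level of families rather than individuals is what prevents related participants from appearing on both sides of a train/validation/test boundary.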
Feature selection
In models with 473 GMV features (Fig 3B&D), a filter-based feature selection method was applied in order to reduce the computational cost, prevent overfitting, and improve model performance. Hence, features of the training sets were ranked and scored by the relief method (if a feature value differs between neighboring instances that have different targets, the relief method increases the rank of that feature [3]). Afterward, 10 different numbers of top-scoring features (20, 30, 40, 50, 60, 70, 80, 90, 100, 110 highest-scoring features) were checked in each inner fold for model training so that the algorithm could select the best feature number in each fold automatically. However, in models with 19 PSQI features or 20 PSQI and anxiety features (Fig 3A&C), we did not perform feature selection. Further, we checked for the absence of feature redundancy among sleep quality features by correlation analysis (eFigure 2).
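The automatic choice among the ten candidate feature numbers can be illustrated with the sketch below. It assumes that a ranking of features is already available (the relief scoring itself is not implemented here), and the function names, parcel labels, and the toy error function standing in for "train a model and compute its validation error" are all hypothetical.

```python
def select_feature_count(ranked_features, candidate_counts, validation_error):
    """Pick the number of top-ranked features with the lowest validation error."""
    best = None
    for count in candidate_counts:
        subset = ranked_features[:count]
        error = validation_error(subset)
        if best is None or error < best[0]:
            best = (error, count, subset)
    return best[1], best[2], best[0]


def toy_validation_error(subset):
    """Stand-in for model training plus validation; lowest at 50 features."""
    return 2.0 + abs(len(subset) - 50) / 100


# Hypothetical labels for 473 parcels, assumed to be sorted by relief rank.
ranked_parcels = [f"parcel_{i:03d}" for i in range(473)]
candidate_counts = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110]

best_count, best_subset, best_error = select_feature_count(
    ranked_parcels, candidate_counts, toy_validation_error
)
print(best_count, best_error)
```

In the actual pipeline this selection would run inside each inner fold, so the chosen feature number can differ from fold to fold.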
Model training and evaluation
Model training and hyperparameter optimization with 100 iterations were performed in the inner loop of cross-validation and repeated 10 times with 10 different feature numbers. Therefore, a total of 1000 models were evaluated by validation sets, and then 10 models with the minimum MSE were selected. These models were fitted on the entire training sets and evaluated by the unseen test sets. Finally, the average performance of models is reported in Fig. 3.
Confirmatory machine learning analysis
We performed three confirmatory analyses in the HCP-Young dataset to show the robustness of our results. In the first stage of confirmatory analyses, we removed 103 participants with a history of depression (eFigure 3). Importantly, the results of this stage remained similar: prediction based on sleep quality yielded r = 0.39, rMSE = 2.53, R2 = 0.15, and prediction based on both sleep quality and anxiety yielded r = 0.61, rMSE = 2.18, R2 = 0.37. In the second stage, we assessed the predictability of DSS based on the 7 components (subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleep medicine, and daytime dysfunction) of the PSQI (eFigure 4); the feature importance of this model is shown in eFigure 5. In the third stage of confirmatory analyses, we included the two sleep-related items of DSS which had been removed at first. As expected, including the sleep-related items of DSS improved predictability. The results of these analyses are provided in eFigure 6. Further, in other analyses, we examined the predictability of DSS based on only GMV and/or anxiety features (eFigure 7), which clarified that GMV alone could not predict DSS.
Out-of-cohort validation
To test the generalizability of our findings, we predicted DSS in the eNKI and HCP-Aging datasets based on ten models which had been trained on the HCP-Young dataset. In this way, every single model was used to predict DSS, and predictions of all ten models were averaged for each sample. Of note, features in out-of-cohort validation and main predictions were the same.
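The averaging of the ten fold-specific models can be sketched as below. The models are represented as plain callables that map a list of feature rows to a list of predicted scores; this interface, the function names, and the toy models are assumptions for illustration only.

```python
def ensemble_predict(models, feature_rows):
    """Average the predictions of all fold models for each participant."""
    per_model = [model(feature_rows) for model in models]
    n_participants = len(feature_rows)
    return [
        sum(predictions[i] for predictions in per_model) / len(models)
        for i in range(n_participants)
    ]


def make_toy_model(offset):
    """Hypothetical stand-in for one trained fold model."""
    def predict(feature_rows):
        return [sum(row) + offset for row in feature_rows]
    return predict


# Ten toy models, one per outer fold, applied to two made-up participants.
toy_models = [make_toy_model(offset) for offset in range(10)]
toy_features = [[1.0, 2.0], [0.0, 0.0]]
toy_predictions = ensemble_predict(toy_models, toy_features)
print(toy_predictions)
```

No model is refitted or tuned on the external cohort in this scheme; the independent datasets are used for prediction only.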
Correlation between sleep quality features. All correlations are significant (p < 0.05) (PSQI: Pittsburgh Sleep Quality Index). For more details, see eTable 1.
Prediction of DSS based on sleep quality and combination of sleep quality and anxiety in the HCP-Young dataset. Results of prediction after excluding 103 participants who experienced at least one episode of depression (DSS: depressive symptoms severity, r: correlation coefficient between real and predicted DSS, rMSE: root mean squared error, R2: determination coefficient).
Prediction of DSS based on seven components (subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleep medicine, and daytime dysfunction) of PSQI in the HCP-Young dataset (PSQI: Pittsburgh Sleep Quality Index, DSS: depressive symptoms severity, r: correlation coefficient between real and predicted DSS, rMSE: root mean squared error, R2: determination coefficient).
Feature importance in prediction of DSS (depressive symptoms severity) based on seven components of sleep quality.
Prediction of DSS in the HCP-Young dataset after including two sleep-related items of DSS (DSS: depressive symptoms severity, GMV: gray matter volume, r: correlation coefficient between real and predicted DSS, rMSE: root mean squared error, R2: determination coefficient).
Prediction of DSS based on GMV, anxiety, and a combination of them in the HCP-Young dataset (GMV: gray matter volume, DSS: depressive symptoms severity, r: correlation coefficient between real and predicted DSS, rMSE: root mean squared error, R2: determination coefficient).