ABSTRACT
IMPORTANCE Electroconvulsive therapy (ECT) is an effective medical procedure for patients with treatment-resistant depression. However, quantitative neural and behavioral measures that characterize how patients respond to ECT treatment are largely lacking.
OBJECTIVE Determine whether neurocomputational models that integrate information about adaptive learning behavior and associated affective experiences can characterize neurobehavioral changes in patients whose depression improves following ECT treatment.
DESIGN This observational study included two research visits from 2020-2023 that occurred before and after standard-of-care ECT for treatment-resistant depression. This report focuses on “visit 2”, which occurred after patients received their initial ECT treatment series.
SETTING Wake Forest University School of Medicine; Atrium Health Wake Forest Baptist Psychiatric Outpatient Center; Atrium Health Wake Forest Hospital.
PARTICIPANTS Participants who received ECT for treatment-resistant depression (“ECT”), and participants not receiving ECT but with depression (“non-ECT”) or without depression (“no-depression”) were recruited from the Psychiatric Outpatient Center and community, respectively.
EXPOSURES Computerized delivery of a Probabilistic Reward and Punishment with Subjective Rating task with functional magnetic resonance imaging.
MAIN OUTCOMES AND MEASURES Computational modeling of choice behavior provided parameters that characterized learning dynamics and associated affect dynamics expressed through intermittent Likert scale self-reports. Multivariate statistical analyses relating model parameters, neurobehavioral responses, and clinical assessments.
RESULTS ECT (N=21; 47.6% female), non-ECT (N=36; 69.4% female), and no-depression (N=38; 65.8% female) participants. Parameters derived from computational models fit to behavior elicited during learning and the expression of affective experiences for all groups reveled specific changes in patients who responded favorably to ECT. ECT-responders demonstrated increased rates of learning from rewarding trials, normalized affective response to punishments, and an increase in the influence of counterfactual ‘missed opportunities’ on affective behavior. Additionally, ECT-responders’ showed changes in BOLD activity regions specific to each of these parameters. ECT-responders’ BOLD-responses to surprising punishments and counterfactual missed opportunities were altered from visit 1 to visit 2 in the inferior frontal operculum, Rolandic operculum, precentral gyrus, and caudate.
CONCLUSIONS AND RELEVANCE Computational models of neurobehavioral dynamics associated with learning and affect can describe specific hypotheses about neurocomputational-mechanisms underlying favorable responses to treatment-resistant depression. Our results suggest computational estimates of learning and affective dynamics may aid in identifying depression phenotypes and treatment outcomes in psychiatric medicine where objective measures are largely lacking.
QUESTION How does ECT treatment change neurocomputational measures of learning and affective experiences in patients with treatment-resistant depression?
FINDINGS In this observational study, computational models were used to quantify the behavioral dynamics of learning and associated changes in subjective feelings in patients who underwent ECT treatment for treatment resistant depression and controls. In ECT-responders we observed increases in reward-based learning, normalized affective responses to surprising positive and negative outcomes, and associated changes in fMRI-measured BOLD-responses.
MEANING Computational phenotyping of task behavior and associated brain responses provides quantification of complex neurobehavioral dynamics and provides specific insight into the neurobehavioral mechanisms underlying successful ECT treatment.
INTRODUCTION
Electroconvulsive therapy (ECT) is an effective medical procedure for patients with treatment-resistant depression (TRD) who are unresponsive to standard antidepressants1,2. Despite decades of use, it is still unclear how ECT-induced changes in brain function give rise to behavioral changes in treatment responders3–5. Objective measures that link neural and behavioral changes produced by ECT treatment may provide a precise explanation of how ECT improves depressive symptoms in patients with TRD6–8.
Computational models of reinforcement learning (RL) have been increasingly applied to better understand depression pathophysiology and treatment mechanisms9–11. Further, quantifying links between measurable and computable learning signals and subjective emotional states has started to reveal how depression impacts the emotional processing of decision outcomes12–15. However, to date, no study has applied neurocomputational depictions of learning and affective experiences to characterize ECT treatment outcomes. The goal of this study was to combine computational models of learning and affective dynamics with functional magnetic resonance imaging (fMRI) to characterize neurobehavioral changes in patients with TRD following their initial ECT treatment series. Specifically, we used a Valence Partitioned Reinforcement Learning (VPRL) model recently shown to explain human choice behavior better than single-valence RL models and that was shown to explain sub-second changes in dopamine fluctuations in humans performing the same task that is used in this work16–18.
While computational RL methods describing post-ECT effects are lacking, other clinical treatments have been shown to improve reward learning deficits in non-treatment resistant depression19–24. For example, antidepressants increase neural responses to reward prediction errors in responsive patients25–27. However, there are variable findings regarding punishment learning28–30. For example, multiple fMRI studies using similar tasks report inconsistent BOLD responses to ‘negative reward prediction errors’ in unmedicated patients with depression31–35. Further, antidepressant studies also report increased36, decreased37, or no change38,39 in patients’ punishment learning when considering ‘negative reward prediction errors’ and the punishment signal.
Brown and colleagues recently separated trials within their RL tasks according to whether the trial was rewarding or punishing (e.g., gains vs. losses, respectively) and demonstrated normalized reward and loss neurocomputations after cognitive behavioral therapy in patients with depression9. This separation of valence aligns with recent evidence suggesting positive and negative valence processing occurs through separate neural systems, which differs from traditional single-valence RL approaches28,40–42. The VPRL framework used in this study follows this line of reasoning and hypothesizes that independent neural systems track positive and negative events whether anticipated or actually experienced16–18,28. We hypothesize that applying VPRL to investigate ECT treatment outcomes may clarify how ECT alters distinct positive and negative learning mechanisms in patients with TRD.
In addition to reinforcement learning and decision making mechanisms, further investigating how positive and negative learning signals differentially influence affective states using neurocomputational models, may identify specific affective mechanisms altered by ECT treatment43–48. For example, Eldar and Niv demonstrated that positive prediction errors improved mood while negative prediction errors worsened mood during a learning task, with emotional states further biasing valuations of subsequent outcomes in individuals with mood instabilities13. Also, Rutledge and colleagues showed that reward expectations and reward prediction errors directly affected mood ratings about recent outcomes in patients with depression12. While such studies dig deeper into the influence of decision outcomes on affective state, there remains a gap in the literature describing how neurobehavioral computations may give rise to affective experiences and how these processes are altered in depression and perhaps favorably modulated by effective treatments including ECT49,50.
In this study, we applied a VPRL framework17,18 to determine whether a computational psychiatric approach51,52 that uses valenced-partitioned (i.e., positive and negative systems) learning signals to predict affective behavior could provide insight into how ECT treatment changes learning and affective dynamics in patients with favorable responses. We hypothesized that ECT would alter neurobehavioral signals related to both learning and affective behavior in ECT treatment responders.
METHODS
Study Design
Here, we report data collected during research “visit 2” of a two-visit observational study following a standard-of-care ECT-treatment timeline (Fig.1A). All participants completed a Probabilistic Reward and Punishment with Subjective Rating task during fMRI scanning. Following fMRI scanning, participants completed the Patient Health Questionnaire 9 (PHQ-9)53, Hamilton Depression Rating Scale (HAM-D)54, and Montreal Cognitive Assessment (MOCA)55. Participants provided written informed consent under Wake Forest University School of Medicine IRB00056131.
Participants
“ECT” patients included patients with TRD who received ECT treatment for the first time at AHWFB Psychiatric Outpatient Center (ECT, N=21; 47.6% female). We defined ECT treatment responders as patients who showed any clinical improvement following their standard-of-care ECT treatment series based on clinician notes or PHQ-9 or HAM-D assessments (ECT Responder, N=17; ECT Non-responder, N=4). Participants with depression not planning ECT (non-ECT, N=36; 69.4% female) and participants without depression (no-depression, N=38; 65.8% female) were recruited from the Winston-Salem, North Carolina area. See eMethods and eTables1-3 for full participant details.
Probabilistic Reward and Punishment with Subjective Rating (PRPwSR) Task
Participants completed the same PRPwSR task completed at research visit 1 (Fig.1B)17,18. Briefly, the PRPwSR task is a value-based choice task where participants make choices and learn to maximize probabilistic rewards (i.e., monetary gains) and minimize probabilistic losses (i.e., monetary losses) over 150 trials. After each trial there is a one-third probability that participants would be asked, “How do you feel about the last outcome?”. Participants respond using a Likert scale ranging from “very bad” to “very good.” See eMethods and eFig.1 for task details. See eFig.2 for group performance measures.
Computational Modeling
We used hierarchical Bayesian methods56 to fit RL computational models to participants’ choice behavior on the PRPwSR task (eMethods). Participants’ expected icon values and outcome prediction errors were estimated using a VPRL framework16–18. These learning signals were then used to model and predict participants’ subjective ratings during the task. We hypothesize that each individual’s unique set of learning and affective parameters could represent a computational phenotype and that ECT responders’ computational phenotype would change relative to pre-ECT-treatment. See below for brief descriptions of model parameters:
Valence Partitioned Reinforcement Learning Model Parameters
A previously validated VPRL framework was used to quantify participants’ choice behavior16–18. VPRL hypothesizes that brains track appetitive (e.g., positive/rewarding) and aversive (e.g., negative/punishing) stimuli simultaneously, but via independent systems. This allows stimuli to predict benefits and/or costs that may be independently estimated such that cost-benefit comparisons can be made16. Each system updates value estimates (i.e., Q-values) akin to standard Q-learning with temporal difference RL rules57. Hence, Q-values for positive and negative state-action sets (st, at) are estimated. Likewise, future states’ (St+1) positive and negative values are estimated and, respectively, discounted by independent parameters (γP,γN). The combination of current actual outcomes, discounted expectations of future values, and expected current values are used to calculate independent temporal difference prediction errors , which are then used to update respective Q-values by independent learning weights (αP, αN). We modeled choice policy using a softmax function with a temperature parameter (τ) random a participant’s choices are given Q-value estimates. The parameters in the VPRL framework (αP, αN, γP, γN, and τ) were used to characterize each participant’s computational learning phenotype. See eMethods for details about the VPRL model.
Subjective Feeling Regression Model Parameters
We hypothesized that participants’ expectations about icon values and prediction errors resulting from choice outcomes collectively contributed to their reported feelings in the PRPwSR task18. We also hypothesized that the contribution of learning signals to subjective ratings in ECT-responders would differ from those observed before ECT treatment. We fit a linear regression model58 with Q-values and prediction errors as independent predictors of subjective ratings. The coefficient parameters in the Subjective Feeling model were used to characterize each participant’s computational affective phenotype. See eMethods for details about the Subjective Feeling model.
fMRI Analyses
See eMethods for details on fMRI data acquisition, pre-processing, and model-based analyses. VPRL and Subjective Feeling models were fit to each participants’ behavior. We developed second-level “visit 2” minus “visit 1” contrasts of specific learning and affective neurocomputations that showed behavioral changes across research visits and performed whole-brain i) one-sample t-tests of blood-oxygen-level-dependent (BOLD) changes for ECT responders and ii) analysis of the variance (ANOVA) of BOLD changes between ECT, non-ECT, and no-depression groups. All statistical analyses were conducted at an uncorrected threshold of p<0.001 and reported results were selected using a family-wise error (FWE)-corrected threshold of p<0.05 at cluster and peak voxel levels.
Statistical Analysis
We used hierarchical Bayesian analysis56 to estimate posterior distributions of free parameters in the VPRL model (αP, αN, γP, γN, and τ) and subsequently conducted a Bayesian linear regression58 with group-informed posterior coefficients in the Subjective Feeling model . For all relevant tests, p<0.05 was deemed significant. Analyses were conducted in R version 4.2.2 and Stan version 2.21.0. (rstan version 2.21.8)59.
RESULTS
Participant characteristics
Across research visits, both ECT and non-ECT groups demonstrated decreases in PHQ-9 and HAM-D scores, indicating depression symptom improvement. ECT patients showed a substantial reduction in PHQ-9 scores compared to both non-ECT and no-depression groups but a substantial reduction in HAM-D scores compared to no-depression participants only. The three groups did not differ in age, gender, race, or ethnicity. See eTable4 for clinical details.
ECT Responder Computational Phenotypes for Valence Partitioned Learning
Behavior
We hypothesized that ECT treatment altered neurobehavioral learning mechanisms in responders. To test this hypothesis, we fit the VPRL model separately to ECT responder, ECT non-responder, non-ECT, and no-depression group choice behavior for both research visits and assessed group-level changes in learning parameters across visits.
ECT responders showed an increase in the positive system learning rate (αP, Table 1A) following treatment (median difference [95% HDI] = 0.15[−0.05, 0.35]). No-depression participants also showed some increase in αP (0.02 [−0.04, 0.09]) while non-ECT participants showed decreased αP across research visits (−0.09 [−0.15, −0.02]). Non-ECT and no-depression groups also showed decreases in 1/.t (non-ECT: −0.08 [−0.20, 0.05]; no-depression: −0.04 [−0.13, 0.04]) across visits.
Neural
Given the change in αP among ECT responders, we performed a one-sample t-test of BOLD change across research visits (visit 2-visit 1) associated with positive system reward prediction errors in ECT responders (see Supplementary Eq.1-2). However, we did not find significant changes in this activity following treatment (Table 2A). We then tested whether BOLD activity associated with changed more in the ECT cohort compared to non-ECT and no-depression groups. We did not find significant differences between the three cohorts after conducting a whole-brain ANOVA on visit 2-visit 1 change in BOLD-associated activity (eTable5A).
ECT Responder Computational Phenotypes for Subjective Feelings
Behavior
We hypothesized that i) VPRL learning signals may drive changes in subjective feelings about the consequences of choices made (Fig.1B) and ii) ECT treatment would alter this affective mechanism in responders for specific learning signal contributions on reported feelings. We expected a linear combination of these signals: to predict subjective ratings.
In ECT-responders only, receiving worse than expected punishments contributed to more negative ratings of feelings after treatment (median difference [95% HDI] = −0.65 [−1.09, −0.22], Table 1B). Both ECT responders and non-ECT participants demonstrated a shift from a negative to positive influence of positive system counterfactual choice expectations on feelings across visits (ECT responders: 1.27 [−0.04, 2.64]; non-ECT: 0.84 [0.10, 1.58]). Alternatively, both ECT responders and no-depression participants shifted from a positive to negative influence of ‘less-bad than expected’ punishments on rated feelings (ECT responders: −0.57 [−1.14, −2.76e-4]; no-depression: −0.58 [−1.06, −0.10]). No-depression participants demonstrated more negative feelings derived from less rewarding outcomes and more positive feelings from positive system chosen choice expectations .
Neural
Given the changes in the Subjective Feeling model parameters we observed in ECT responders, we performed a one-sample t-test to determine BOLD activity change associated with subjective ratings influenced by across research visits (visit 2-visit 1). We found BOLD-associated changes in all parameters for ECT responders, with increased activity in the right inferior frontal operculum and decreased activity in the left rolandic operculum, left caudate, and right precentral gyrus following treatment (Fig.2; Table 2B).
To assess whether BOLD-associated changes in subjective feeling computations differed across ECT, non-ECT, and no-depression groups, we performed a whole-brain ANOVA based on behavioral results (we again assessed for group differences in emotional impacts of . We found differences in BOLD-associated punishment prediction error changes across the groups (eTable5B). Post-hoc t-tests showed greater changes in BOLD-related activity for non-ECT participants compared to ECT patients in the right calcarine gyrus, right precuneus, and left posterior cingulum and for no-depression participants compared to ECT patients in the right angular gyrus.
DISCUSSION
We investigated whether and how ECT treatment changed computational phenotypic depictions of the neurobehavioral dynamics of reward and punishment learning and associated subjective experiences in patients with TRD (who were previously naïve to ECT). We assessed specific changes in ECT-responders, defined by any clinical improvement in depression, participants with depression managed by medications or therapy, and participants without depression to pinpoint treatment specific effects versus general neurobehavioral changes that may have occurred across research visits. We used a hierarchical Bayesian approach to fit a Valence Partitioned Reinforcement Learning (VPRL)16–18 model to participant choice behavior on a Probabilistic Reward and Punishment with Subjective Rating (PRPwSR) task (Fig.1B). This model provided parameters describing learning mechanisms in which ECT responders demonstrated an increase in reward learning rate following treatment (Table 1A). From VPRL parameters, we then estimated participants’ expectations and prediction errors to model their influence on participants’ subjective feelings of decision outcomes during the task. Following treatment, ECT-responders demonstrated unique shifts in how unchosen potentially good (i.e., ‘missed opportunities’) and actual worse-than-expected losses affected their feelings, suggesting a treatment effect in positive and negative RL-affective systems (Table 1B). These specific changes in subjective experience were further associated with changes in BOLD-responses associated with hypothesized affective processes (Fig.2). Our collective results suggest that specific neurobehavioral mechanisms underlying learning and affective processes are altered by successful ECT treatment. Further, these signals may provide a potential quantitative measure for ECT-responders and a favorable ECT response.
To our knowledge, our study is the first to apply a computational psychiatric approach to investigate ECT treatment mechanisms in patients with TRD, and to pair VPRL16–18 and related models to extract subjective experience16,18 in patients with depression. ECT responders demonstrated increased reward learning rates following treatment compared to other cohorts, suggestive of an ECT treatment effect that may allow relearning of the relationship between potentially good experiences and behaviors that promote them. This is consistent with prior work demonstrating that various types of antidepressants can improve reward learning in responsive patients with depression19,26,27, but demonstrates the role that ECT can have in this specific process in treatment resistant patients. Notably, the ECT patients in this study had tried multiple (and were currently taking) antidepressants that were ineffective and therefore proceeded to ECT treatment. These findings suggest ECT may be effective in targeting similar neural circuits associated with reward processing that the mechanisms of antidepressants could not alter for patients with TRD in our study29,38. We did not find neural correlates to reward prediction error signaling improvements in ECT treatment responders, indicating further work is needed to pinpoint neural effects of reward learning from ECT.
We found both neural and behavioral changes in how learning signals come to drive affective responses in ECT responders. These results suggest that ECT may have changed mechanisms underlying how good and bad outcomes are processed in patients and how they come to affect subjective feelings. Interestingly, counterfactual ‘missed opportunities’ showed a significant increase in the influence on positive feelings in both ECT-responders and no-depression groups across visits. Originally this parameters promoted strong negative feelings in responders, but after treatment, promoted positive feelings similar to what we observed in no-depression participants. Individuals with depression often experience challenges in experiencing pleasure or motivation (i.e., anhedonia). Despite not seeking rewards, this may cause rumination over missed opportunities and lead to negative feelings7,60,61. Our results suggest ECT may normalize this affective mechanism and involve regions such as the caudate and rolandic operculum known to be involved with reward seeking and emotional processing62,63. Further, ‘worse-than-expected’ punishments contributed to more negative feelings after ECT treatment in responders (compared to before ECT). Negative bias is a known depressive symptom where maintained expectations about negative events often lead to unsurprising emotional reactions when such events arise, often contributing to flatter affect in patients with depression64–66. As such, ECT may alter cognitive-emotional processes, previously shown to be tracked by the inferior frontal operculum67, translating to increased emotional reactivity to punishments in responders.
The VPRL framework16–18 used here allowed us to identify potential computational phenotypes of ECT responders under hypotheses of independent reward and punishment learning systems in the brain. We identified specific learning and affective mechanisms that accompanied depression improvement in responders, which provided an objective, translatable understanding of successful ECT mechanisms51. Importantly, we recently demonstrated that the VPRL model tracks sub-second fluctuations in dopaminergic concentrations in the human brain using the same task used in this study17. And prior work has suggested that counterfactual signals can have a significantly influence on sub-second dopamine signals in humans68. In our visit 1 (pre-ECT) work, we showed that ‘worse-than-expected’ punishment prediction errors accounted for differences in ‘pre-ECT’ patients with TRD; here, we show that these differences were normalized in ECT treatment responders and correlated these changes to increased BOLD activity in the inferior frontal operculum. These results are consistent with prior work that suggested that punishment learning processes may be altered in depression9 but extend these studies by suggesting specific neurocomputational mechanisms underlying ECT symptom improvement. More work is needed to clarify the specificity of our proposed computational phenotypes of ECT treatment responders and whether these tools may be used to augment clinical work.
Limitations
Our sample sizes for ECT non-responder group were small (N=4) and we did not have enough statistical power to run separate analyses to compare of investigate effects in non-responders. We aimed to observe the natural course of the current standard of care for ECT, therefore, ECT sessions varied by patient due to individual treatment plans that the research team was not involved with; however, this variation likely contributed to the overall successful treatment outcomes. Given this variation in treatment, but uniformity of treatment outcomes, our results may point to a general neurobehavioral state that may be a target for successful treatment. More work is needed to clarify these observations and test these hypotheses.
Conclusions
We estimated the magnitude of dynamic neurobehavioral changes in patients receiving ECT by using neurocomputational models of learning and affective processes fit to behavioral data from patients with TRD. This study demonstrates the utility of applying computational models to assess dynamic behavioral, neural, cognitive, and emotional processes that cannot not be captured by current clinical assessments. Future work developing computationally derived estimate of neurobehavioral signals relevant to clinical decisions and patient behaviors is needed to further advance objective and quantitative tools into psychiatry where subjective reports and trial-and-error methods are the current standard of care.
Data Availability
All data produced in the present study are available upon reasonable request to kkishida{at}wakehealth.edu
Author Contributions
Kishida and Jones had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Kishida, Gligorovic, Douglas, Jones, Sands.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Jones, Kishida.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Jones, Sands, Trattner.
Obtained funding: Kishida.
Administrative, technical, or material support: All authors.
Study supervision: Kishida, Gligorovic, Douglas, Ramos, Jones.
Conflict of Interest Disclosures
None reported.
Funding/Support: Funding/Support
This work was supported by the National Institute of Health (grants R01DA048096, R01MH12099, R01MH124115, and P50AA026117 to Dr. Kishida; grant F31DA053174 to Dr. Sands). This work was also supported by Wake Forest University School of Medicine: Clinical and Translational Science Institute; and Wake Forest University School of Medicine: Neuroscience Clinical Trial and Innovation Center.
Role of the Funder/Sponsor
The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions
We thank all of the individuals who took part in this study, clinic staff who assisted with recruitment, and MRI staff who assisted with research visits. We thank Dr. Mary Moya-Mendez for her work in advertisement and IRB development. The following research assistants from the Translational Neuroscience Department at Wake Forest School of Medicine were involved in data collection: Ashley Shipp, BS and Michael Howze, BS.